2017-06-29 09:22:17

by Abdul Haleem

Subject: [linux-next] cpus stalls detected few hours after booting next kernel

Hi,

On a PowerPC bare-metal machine running the 4.12.0-rc7-next-20170628
kernel, dmesg is flooded with CPU stall messages and NMI backtraces
about an hour after boot.

Tests: none; the machine was simply left idle for a few hours after boot.
Machine type: POWER8 bare-metal
Kernel: 4.12.0-rc7-next-20170628
gcc: 4.8.5 20150623
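
(The RCU stall detector timeout in effect can be read at runtime,
assuming the usual module parameter path:

cat /sys/module/rcupdate/parameters/rcu_cpu_stall_timeout)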


Trace logs:
-----------

[ 4255.148372] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 4255.148446] 0-...: (30267 GPs behind) idle=238/0/0 softirq=504/504 fqs=0
[ 4255.148462] 1-...: (665 GPs behind) idle=6f0/0/0 softirq=347/347 fqs=0
[ 4255.148477] 2-...: (409 GPs behind) idle=8ac/0/0 softirq=496/496 fqs=0
[ 4255.148542] 6-...: (27907 GPs behind) idle=808/0/0 softirq=2609/2610 fqs=0
[ 4255.148618] 8-...: (2672 GPs behind) idle=c78/0/0 softirq=338/338 fqs=0
[ 4255.148682] 9-...: (2229 GPs behind) idle=f18/0/0 softirq=432/432 fqs=0
[ 4255.148745] 10-...: (24370 GPs behind) idle=a34/0/0 softirq=797/798 fqs=0
[ 4255.148809] 11-...: (24870 GPs behind) idle=848/0/0 softirq=427/427 fqs=0
[ 4255.148873] 12-...: (24870 GPs behind) idle=11c/0/0 softirq=297/297 fqs=0
[ 4255.148936] 13-...: (30549 GPs behind) idle=d44/0/0 softirq=278/278 fqs=0
[ 4255.148999] 14-...: (30555 GPs behind) idle=420/0/0 softirq=359/359 fqs=0
[ 4255.149063] 15-...: (30255 GPs behind) idle=1dc/0/0 softirq=416/416 fqs=0
[ 4255.149127] 16-...: (30531 GPs behind) idle=a58/0/0 softirq=275/275 fqs=0
[ 4255.149191] 18-...: (30531 GPs behind) idle=b14/0/0 softirq=310/310 fqs=0
[ 4255.149254] 19-...: (30558 GPs behind) idle=174/0/0 softirq=290/291 fqs=0
[ 4255.149318] 20-...: (30012 GPs behind) idle=0f0/0/0 softirq=495/495 fqs=0
[ 4255.149382] 21-...: (3782 GPs behind) idle=438/0/0 softirq=445/445 fqs=0
[ 4255.149445] 22-...: (29534 GPs behind) idle=b1c/0/0 softirq=361/361 fqs=0
[ 4255.149508] 23-...: (1019 GPs behind) idle=3d0/0/0 softirq=308/308 fqs=0
[ 4255.149572] 24-...: (30558 GPs behind) idle=ef0/0/0 softirq=643/643 fqs=0
[ 4255.149636] 25-...: (30556 GPs behind) idle=de8/0/0 softirq=495/495 fqs=0
[ 4255.149699] 26-...: (30531 GPs behind) idle=740/0/0 softirq=310/310 fqs=0
[ 4255.149763] 27-...: (30531 GPs behind) idle=5cc/0/0 softirq=304/304 fqs=0
[ 4255.149826] 28-...: (9032 GPs behind) idle=b9c/0/0 softirq=308/308 fqs=0
[ 4255.149890] 29-...: (30552 GPs behind) idle=fe8/0/0 softirq=304/304 fqs=0
[ 4255.149953] 30-...: (30531 GPs behind) idle=a3c/0/0 softirq=473/473 fqs=0
[ 4255.150017] 31-...: (30555 GPs behind) idle=cfc/0/0 softirq=308/308 fqs=0
[ 4255.150081] 32-...: (29496 GPs behind) idle=be4/0/0 softirq=268/268 fqs=0
[ 4255.150144] 34-...: (4884 GPs behind) idle=af8/0/0 softirq=333/333 fqs=0
[ 4255.150208] 35-...: (4885 GPs behind) idle=c84/0/0 softirq=6151/6151 fqs=0
[ 4255.150283] 36-...: (14337 GPs behind) idle=23c/0/0 softirq=946/946 fqs=0
[ 4255.150347] 37-...: (30220 GPs behind) idle=800/0/0 softirq=288/288 fqs=0
[ 4255.150410] 38-...: (30217 GPs behind) idle=068/0/0 softirq=332/332 fqs=0
[ 4255.150474] 39-...: (30078 GPs behind) idle=f04/0/0 softirq=270/270 fqs=0
[ 4255.150538] 40-...: (18 GPs behind) idle=d00/0/0 softirq=1235/1235 fqs=0
[ 4255.150602] 64-...: (209 GPs behind) idle=d74/0/0 softirq=2358/2358 fqs=0
[ 4255.150665] 65-...: (204 GPs behind) idle=9b0/0/0 softirq=2531/2531 fqs=0
[ 4255.150729] 66-...: (273 GPs behind) idle=bbc/0/0 softirq=2684/2684 fqs=0
[ 4255.150793] 69-...: (364 GPs behind) idle=b64/0/0 softirq=2801/2801 fqs=0
[ 4255.150856] 72-...: (30 GPs behind) idle=e28/0/0 softirq=1381/1381 fqs=0
[ 4255.150920] 77-...: (7161 GPs behind) idle=acc/0/0 softirq=7358/7358 fqs=0
[ 4255.150996] 78-...: (1 GPs behind) idle=b68/0/0 softirq=11962/11967 fqs=0
[ 4255.151060] 79-...: (119 GPs behind) idle=bf0/0/0 softirq=6186/6186 fqs=0
[ 4255.151121] (detected by 5, t=2102 jiffies, g=30521, c=30520, q=1069)
[ 4255.151192] Sending NMI from CPU 5 to CPUs 0:
[ 4255.151246] NMI backtrace for cpu 0
[ 4255.151287] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc7-next-20170628 #2
[ 4255.151363] task: c0000007f8495600 task.stack: c0000007f842c000
[ 4255.151428] NIP: c00000000000adb4 LR: c000000000015584 CTR: c00000000082f4b0
[ 4255.151504] REGS: c0000007f842fb60 TRAP: 0e81 Not tainted (4.12.0-rc7-next-20170628)
[ 4255.151578] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
[ 4255.151586] CR: 22004884 XER: 00000000
[ 4255.151675] CFAR: c00000000062c108 SOFTE: 1
[ 4255.151675] GPR00: c00000000082d6c8 c0000007f842fde0 c000000001062b00 0000000028000000
[ 4255.151675] GPR04: 0000000000000003 c000000000089830 00003aa8056bc35f 0000000000000001
[ 4255.151675] GPR08: 0000000000000002 c000000000d52d80 00000007fe7d0000 9000000000001003
[ 4255.151675] GPR12: c00000000082a0c0 c00000000fd40000
[ 4255.152217] NIP [c00000000000adb4] .L__replay_interrupt_return+0x0/0x4
[ 4255.152334] LR [c000000000015584] arch_local_irq_restore+0x74/0x90
[ 4255.152447] Call Trace:
[ 4255.152499] [c0000007f842fde0] [c00000000017cec0] tick_broadcast_oneshot_control+0x40/0x60 (unreliable)
[ 4255.152662] [c0000007f842fe00] [c00000000082d6c8] cpuidle_enter_state+0x108/0x3d0
[ 4255.152803] [c0000007f842fe60] [c000000000133e94] call_cpuidle+0x44/0x80
[ 4255.152921] [c0000007f842fe80] [c000000000134240] do_idle+0x290/0x2f0
[ 4255.153037] [c0000007f842fef0] [c000000000134474] cpu_startup_entry+0x34/0x40
[ 4255.153176] [c0000007f842ff20] [c000000000041944] start_secondary+0x304/0x360
[ 4255.153316] [c0000007f842ff90] [c00000000000b16c] start_secondary_prolog+0x10/0x14
[ 4255.153455] Instruction dump:
[ 4255.153527] 7d200026 618c8000 2c030900 4182e320 2c030500 4182dd68 2c030e80 4182ffa4
[ 4255.153668] 2c030ea0 4182f078 2c030e60 4182edb0 <4e800020> 7c781b78 480003c9 480003e1


Regards,
Abdul Haleem
IBM Linux Technology Center.


Attachments:
dmesglogs.txt (169.07 kB)
Tul-NV-config (84.68 kB)

2017-06-29 09:41:38

by Nicholas Piggin

Subject: Re: [linux-next] cpus stalls detected few hours after booting next kernel

I've seen this as well (or something like it) in mambo at boot, but
it's pretty rare to hit. I'm trying to debug it.

I'm guessing it's an idle vs. interrupt race.

> [ 4255.151192] Sending NMI from CPU 5 to CPUs 0:
> [ 4255.151246] NMI backtrace for cpu 0
> [ 4255.151287] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc7-next-20170628 #2
> [ 4255.151363] task: c0000007f8495600 task.stack: c0000007f842c000
> [ 4255.151428] NIP: c00000000000adb4 LR: c000000000015584 CTR: c00000000082f4b0
> [ 4255.151504] REGS: c0000007f842fb60 TRAP: 0e81 Not tainted (4.12.0-rc7-next-20170628)
> [ 4255.151578] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
> [ 4255.151586] CR: 22004884 XER: 00000000
> [ 4255.151675] CFAR: c00000000062c108 SOFTE: 1
> [ 4255.151675] GPR00: c00000000082d6c8 c0000007f842fde0 c000000001062b00 0000000028000000
> [ 4255.151675] GPR04: 0000000000000003 c000000000089830 00003aa8056bc35f 0000000000000001
> [ 4255.151675] GPR08: 0000000000000002 c000000000d52d80 00000007fe7d0000 9000000000001003
> [ 4255.151675] GPR12: c00000000082a0c0 c00000000fd40000
> [ 4255.152217] NIP [c00000000000adb4] .L__replay_interrupt_return+0x0/0x4
> [ 4255.152334] LR [c000000000015584] arch_local_irq_restore+0x74/0x90
> [ 4255.152447] Call Trace:
> [ 4255.152499] [c0000007f842fde0] [c00000000017cec0] tick_broadcast_oneshot_control+0x40/0x60 (unreliable)
> [ 4255.152662] [c0000007f842fe00] [c00000000082d6c8] cpuidle_enter_state+0x108/0x3d0
> [ 4255.152803] [c0000007f842fe60] [c000000000133e94] call_cpuidle+0x44/0x80
> [ 4255.152921] [c0000007f842fe80] [c000000000134240] do_idle+0x290/0x2f0
> [ 4255.153037] [c0000007f842fef0] [c000000000134474] cpu_startup_entry+0x34/0x40
> [ 4255.153176] [c0000007f842ff20] [c000000000041944] start_secondary+0x304/0x360
> [ 4255.153316] [c0000007f842ff90] [c00000000000b16c] start_secondary_prolog+0x10/0x14
> [ 4255.153455] Instruction dump:
> [ 4255.153527] 7d200026 618c8000 2c030900 4182e320 2c030500 4182dd68 2c030e80 4182ffa4
> [ 4255.153668] 2c030ea0 4182f078 2c030e60 4182edb0 <4e800020> 7c781b78 480003c9 480003e1

What I think you're seeing here is just that the lockup IPI causes the
CPU to wake from idle, so the backtrace gets recorded at the point where
the interrupt is replayed, when the cpuidle code returns from the
low-level idle call and re-enables interrupts.
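
To illustrate the mechanism, here is a minimal user-space sketch of the
lazy-disable/replay idea (not the kernel source; the PACA_IRQ_* values
and state variables only model the real irq_happened/SOFTE bookkeeping):

#include <stdio.h>

#define PACA_IRQ_HARD_DIS 0x01  /* hard-disabled: do not replay yet */
#define PACA_IRQ_DEC      0x08  /* decrementer (timer) interrupt latched */

static int soft_enabled;          /* models the SOFTE bit in the register dump */
static unsigned int irq_happened; /* interrupts latched while soft-disabled */

static void handle_irq(unsigned int irq)
{
	printf("replaying latched irq %#x (soft_enabled=%d)\n", irq, soft_enabled);
}

/* Models arch_local_irq_restore(): on re-enable, replay anything that
 * arrived while we were soft-disabled. An NMI landing in here samples
 * the NIP inside the replay path, which is why the backtrace points
 * into the cpuidle return path rather than at the original interrupt
 * site. */
static void irq_restore_sketch(void)
{
	soft_enabled = 1;
	if (!(irq_happened & ~PACA_IRQ_HARD_DIS))
		return;
	if (irq_happened & PACA_IRQ_DEC)
		handle_irq(PACA_IRQ_DEC);
	irq_happened = 0;
}

int main(void)
{
	soft_enabled = 0;             /* idle entry runs soft-disabled */
	irq_happened |= PACA_IRQ_DEC; /* the wakeup interrupt gets latched */
	irq_restore_sketch();         /* cpuidle return path re-enables */
	return 0;
}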

I don't *think* the replay-wakeup-interrupt patch is directly involved, but
it's likely to be one of the idle patches.

Thanks,
Nick

2017-06-29 10:23:32

by Nicholas Piggin

Subject: Re: [linux-next] cpus stalls detected few hours after booting next kernel

On Thu, 29 Jun 2017 19:36:14 +1000
Nicholas Piggin <[email protected]> wrote:

> I've seen this as well (or something like it) in mambo at boot, but
> it's pretty rare to hit. I'm trying to debug it.
>
> I'm guessing it's an idle vs. interrupt race.
>
> > [ 4255.151192] Sending NMI from CPU 5 to CPUs 0:
> > [ 4255.151246] NMI backtrace for cpu 0
> > [ 4255.151287] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.12.0-rc7-next-20170628 #2
> > [ 4255.151363] task: c0000007f8495600 task.stack: c0000007f842c000
> > [ 4255.151428] NIP: c00000000000adb4 LR: c000000000015584 CTR: c00000000082f4b0
> > [ 4255.151504] REGS: c0000007f842fb60 TRAP: 0e81 Not tainted (4.12.0-rc7-next-20170628)
> > [ 4255.151578] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
> > [ 4255.151586] CR: 22004884 XER: 00000000
> > [ 4255.151675] CFAR: c00000000062c108 SOFTE: 1
> > [ 4255.151675] GPR00: c00000000082d6c8 c0000007f842fde0 c000000001062b00 0000000028000000
> > [ 4255.151675] GPR04: 0000000000000003 c000000000089830 00003aa8056bc35f 0000000000000001
> > [ 4255.151675] GPR08: 0000000000000002 c000000000d52d80 00000007fe7d0000 9000000000001003
> > [ 4255.151675] GPR12: c00000000082a0c0 c00000000fd40000
> > [ 4255.152217] NIP [c00000000000adb4] .L__replay_interrupt_return+0x0/0x4
> > [ 4255.152334] LR [c000000000015584] arch_local_irq_restore+0x74/0x90
> > [ 4255.152447] Call Trace:
> > [ 4255.152499] [c0000007f842fde0] [c00000000017cec0] tick_broadcast_oneshot_control+0x40/0x60 (unreliable)
> > [ 4255.152662] [c0000007f842fe00] [c00000000082d6c8] cpuidle_enter_state+0x108/0x3d0
> > [ 4255.152803] [c0000007f842fe60] [c000000000133e94] call_cpuidle+0x44/0x80
> > [ 4255.152921] [c0000007f842fe80] [c000000000134240] do_idle+0x290/0x2f0
> > [ 4255.153037] [c0000007f842fef0] [c000000000134474] cpu_startup_entry+0x34/0x40
> > [ 4255.153176] [c0000007f842ff20] [c000000000041944] start_secondary+0x304/0x360
> > [ 4255.153316] [c0000007f842ff90] [c00000000000b16c] start_secondary_prolog+0x10/0x14
> > [ 4255.153455] Instruction dump:
> > [ 4255.153527] 7d200026 618c8000 2c030900 4182e320 2c030500 4182dd68 2c030e80 4182ffa4
> > [ 4255.153668] 2c030ea0 4182f078 2c030e60 4182edb0 <4e800020> 7c781b78 480003c9 480003e1
>
> What I think you're seeing here is just that the lockup IPI causes the
> CPU to wake from idle, so the backtrace gets recorded at the point where
> the interrupt is replayed, when the cpuidle code returns from the
> low-level idle call and re-enables interrupts.
>
> I don't *think* the replay-wakeup-interrupt patch is directly involved, but
> it's likely to be one of the idle patches.

Although you have this in the backtrace. I wonder if that's a stuck
lock in rcu_process_callbacks?


[ 5948.345539] CPU: 63 PID: 7360 Comm: rs:main Q:Reg Not tainted 4.12.0-rc7-next-20170628 #2
[ 5948.345612] task: c000000f1c14e600 task.stack: c000000f1c1e8000
[ 5948.345672] NIP: c0000000009c7a10 LR: c0000000009c7a08 CTR: c00000000015eda0
[ 5948.358553] REGS: c000000f1c1eb150 TRAP: 0501 Not tainted (4.12.0-rc7-next-20170628)
[ 5948.358623] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
[ 5948.358626] CR: 28082884 XER: 20000000
[ 5948.358706] CFAR: c0000000009c7a28 SOFTE: 1
GPR00: c00000000015f14c c000000f1c1eb3d0 c000000001062b00 0000000000000001
GPR04: c000000fff6e6180 0000000000000000 0000000000000001 00000000000000cc
GPR08: 0000000000000001 000000008000004a 0000000000000000 0000000000000000
GPR12: c00000000015eda0 c00000000fd55a80
[ 5948.358986] NIP [c0000000009c7a10] _raw_spin_lock_irqsave+0x90/0x100
[ 5948.359043] LR [c0000000009c7a08] _raw_spin_lock_irqsave+0x88/0x100
[ 5948.359099] Call Trace:
[ 5948.359123] [c000000f1c1eb3d0] [0000000000000001] 0x1 (unreliable)
[ 5948.359182] [c000000f1c1eb410] [c00000000015f14c] rcu_process_callbacks+0x3ac/0x620
[ 5948.359252] [c000000f1c1eb4c0] [c0000000000e1e0c] __do_softirq+0x14c/0x3a0
[ 5948.365958] [c000000f1c1eb5b0] [c0000000000e2448] irq_exit+0x108/0x120
[ 5948.366016] [c000000f1c1eb5d0] [c0000000000232b4] timer_interrupt+0xa4/0xe0
[ 5948.366075] [c000000f1c1eb600] [c000000000009208] decrementer_common+0x158/0x160
[ 5948.366149] --- interrupt: 901 at start_this_handle+0xd0/0x4b0
LR = jbd2__journal_start+0x17c/0x2b0
[ 5948.366242] [c000000f1c1eb8f0] [c000000f20a5cb00] 0xc000000f20a5cb00 (unreliable)
[ 5948.366314] [c000000f1c1eba00] [c00000000040717c] jbd2__journal_start+0x17c/0x2b0
[ 5948.366388] [c000000f1c1eba70] [c00000000038edf4] __ext4_journal_start_sb+0x84/0x180
[ 5948.366459] [c000000f1c1ebad0] [c0000000003b47dc] ext4_da_write_begin+0x17c/0x520
[ 5948.366532] [c000000f1c1ebb90] [c00000000021f9c8] generic_perform_write+0xe8/0x250
[ 5948.366604] [c000000f1c1ebc20] [c000000000220d20] __generic_file_write_iter+0x200/0x240
[ 5948.366677] [c000000f1c1ebc80] [c00000000039d614] ext4_file_write_iter+0x2e4/0x4d0
[ 5948.373255] [c000000f1c1ebd00] [c0000000002e13c0] __vfs_write+0x120/0x200
[ 5948.373313] [c000000f1c1ebd90] [c0000000002e2c48] vfs_write+0xc8/0x240
[ 5948.373371] [c000000f1c1ebde0] [c0000000002e4940] SyS_write+0x60/0x110
[ 5948.373430] [c000000f1c1ebe30] [c00000000000b8e0] system_call+0x38/0xdc
[ 5948.373486] Instruction dump:
[ 5948.373521] 7fe3fb78 e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 4e800020 8bad028a
[ 5948.373592] 7fe3fb78 4b64db15 60000000 7c210b78 <e92d0000> 89290009 792affe3 40820048
[ 5948.374515] Sending NMI from CPU 74 to CPUs

2017-06-29 14:45:55

by Nicholas Piggin

Subject: Re: [linux-next] cpus stalls detected few hours after booting next kernel

On Thu, 29 Jun 2017 20:23:05 +1000
Nicholas Piggin <[email protected]> wrote:

> On Thu, 29 Jun 2017 19:36:14 +1000
> Nicholas Piggin <[email protected]> wrote:

> > I don't *think* the replay-wakeup-interrupt patch is directly involved, but
> > it's likely to be one of the idle patches.

Okay this turned out to be misconfigured sleep states I added for the
simulator, sorry for the false alarm.

> Although you have this in the backtrace. I wonder if that's a stuck
> lock in rcu_process_callbacks?

So this spinlock becomes top of the list of suspects. Can you try
enabling lockdep and try to reproduce it?
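
Something like the following config should do it (assuming the usual
Kconfig names; CONFIG_PROVE_LOCKING is the option that actually turns
the checking on, and it selects CONFIG_LOCKDEP):

CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_LOCKDEP=y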

> [ 5948.345539] CPU: 63 PID: 7360 Comm: rs:main Q:Reg Not tainted 4.12.0-rc7-next-20170628 #2
> [ 5948.345612] task: c000000f1c14e600 task.stack: c000000f1c1e8000
> [ 5948.345672] NIP: c0000000009c7a10 LR: c0000000009c7a08 CTR: c00000000015eda0
> [ 5948.358553] REGS: c000000f1c1eb150 TRAP: 0501 Not tainted (4.12.0-rc7-next-20170628)
> [ 5948.358623] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
> [ 5948.358626] CR: 28082884 XER: 20000000
> [ 5948.358706] CFAR: c0000000009c7a28 SOFTE: 1
> GPR00: c00000000015f14c c000000f1c1eb3d0 c000000001062b00 0000000000000001
> GPR04: c000000fff6e6180 0000000000000000 0000000000000001 00000000000000cc
> GPR08: 0000000000000001 000000008000004a 0000000000000000 0000000000000000
> GPR12: c00000000015eda0 c00000000fd55a80
> [ 5948.358986] NIP [c0000000009c7a10] _raw_spin_lock_irqsave+0x90/0x100
> [ 5948.359043] LR [c0000000009c7a08] _raw_spin_lock_irqsave+0x88/0x100
> [ 5948.359099] Call Trace:
> [ 5948.359123] [c000000f1c1eb3d0] [0000000000000001] 0x1 (unreliable)
> [ 5948.359182] [c000000f1c1eb410] [c00000000015f14c] rcu_process_callbacks+0x3ac/0x620
> [ 5948.359252] [c000000f1c1eb4c0] [c0000000000e1e0c] __do_softirq+0x14c/0x3a0
> [ 5948.365958] [c000000f1c1eb5b0] [c0000000000e2448] irq_exit+0x108/0x120
> [ 5948.366016] [c000000f1c1eb5d0] [c0000000000232b4] timer_interrupt+0xa4/0xe0
> [ 5948.366075] [c000000f1c1eb600] [c000000000009208] decrementer_common+0x158/0x160
> [ 5948.366149] --- interrupt: 901 at start_this_handle+0xd0/0x4b0
> LR = jbd2__journal_start+0x17c/0x2b0
> [ 5948.366242] [c000000f1c1eb8f0] [c000000f20a5cb00] 0xc000000f20a5cb00 (unreliable)
> [ 5948.366314] [c000000f1c1eba00] [c00000000040717c] jbd2__journal_start+0x17c/0x2b0
> [ 5948.366388] [c000000f1c1eba70] [c00000000038edf4] __ext4_journal_start_sb+0x84/0x180
> [ 5948.366459] [c000000f1c1ebad0] [c0000000003b47dc] ext4_da_write_begin+0x17c/0x520
> [ 5948.366532] [c000000f1c1ebb90] [c00000000021f9c8] generic_perform_write+0xe8/0x250
> [ 5948.366604] [c000000f1c1ebc20] [c000000000220d20] __generic_file_write_iter+0x200/0x240
> [ 5948.366677] [c000000f1c1ebc80] [c00000000039d614] ext4_file_write_iter+0x2e4/0x4d0
> [ 5948.373255] [c000000f1c1ebd00] [c0000000002e13c0] __vfs_write+0x120/0x200
> [ 5948.373313] [c000000f1c1ebd90] [c0000000002e2c48] vfs_write+0xc8/0x240
> [ 5948.373371] [c000000f1c1ebde0] [c0000000002e4940] SyS_write+0x60/0x110
> [ 5948.373430] [c000000f1c1ebe30] [c00000000000b8e0] system_call+0x38/0xdc
> [ 5948.373486] Instruction dump:
> [ 5948.373521] 7fe3fb78 e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 4e800020 8bad028a
> [ 5948.373592] 7fe3fb78 4b64db15 60000000 7c210b78 <e92d0000> 89290009 792affe3 40820048
> [ 5948.374515] Sending NMI from CPU 74 to CPUs
>

2017-06-30 05:23:54

by Abdul Haleem

Subject: Re: [linux-next] cpus stalls detected few hours after booting next kernel

On Fri, 2017-06-30 at 00:45 +1000, Nicholas Piggin wrote:
> On Thu, 29 Jun 2017 20:23:05 +1000
> Nicholas Piggin <[email protected]> wrote:
>
> > On Thu, 29 Jun 2017 19:36:14 +1000
> > Nicholas Piggin <[email protected]> wrote:
>
> > > I don't *think* the replay-wakeup-interrupt patch is directly involved, but
> > > it's likely to be one of the idle patches.
>
> Okay this turned out to be misconfigured sleep states I added for the
> simulator, sorry for the false alarm.
>
> > Although you have this in the backtrace. I wonder if that's a stuck
> > lock in rcu_process_callbacks?
>
> So this spinlock becomes top of the list of suspects. Can you try
> enabling lockdep and try to reproduce it?

Yes, recreated again with CONFIG_LOCKDEP=y & CONFIG_DEBUG_LOCKDEP=y set.
I do not see any difference in trace messages with and without LOCKDEP
enabled.

Please find the attached log file.

>
> > [ 5948.345539] CPU: 63 PID: 7360 Comm: rs:main Q:Reg Not tainted 4.12.0-rc7-next-20170628 #2
> > [ 5948.345612] task: c000000f1c14e600 task.stack: c000000f1c1e8000
> > [ 5948.345672] NIP: c0000000009c7a10 LR: c0000000009c7a08 CTR: c00000000015eda0
> > [ 5948.358553] REGS: c000000f1c1eb150 TRAP: 0501 Not tainted (4.12.0-rc7-next-20170628)
> > [ 5948.358623] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE>
> > [ 5948.358626] CR: 28082884 XER: 20000000
> > [ 5948.358706] CFAR: c0000000009c7a28 SOFTE: 1
> > GPR00: c00000000015f14c c000000f1c1eb3d0 c000000001062b00 0000000000000001
> > GPR04: c000000fff6e6180 0000000000000000 0000000000000001 00000000000000cc
> > GPR08: 0000000000000001 000000008000004a 0000000000000000 0000000000000000
> > GPR12: c00000000015eda0 c00000000fd55a80
> > [ 5948.358986] NIP [c0000000009c7a10] _raw_spin_lock_irqsave+0x90/0x100
> > [ 5948.359043] LR [c0000000009c7a08] _raw_spin_lock_irqsave+0x88/0x100
> > [ 5948.359099] Call Trace:
> > [ 5948.359123] [c000000f1c1eb3d0] [0000000000000001] 0x1 (unreliable)
> > [ 5948.359182] [c000000f1c1eb410] [c00000000015f14c] rcu_process_callbacks+0x3ac/0x620
> > [ 5948.359252] [c000000f1c1eb4c0] [c0000000000e1e0c] __do_softirq+0x14c/0x3a0
> > [ 5948.365958] [c000000f1c1eb5b0] [c0000000000e2448] irq_exit+0x108/0x120
> > [ 5948.366016] [c000000f1c1eb5d0] [c0000000000232b4] timer_interrupt+0xa4/0xe0
> > [ 5948.366075] [c000000f1c1eb600] [c000000000009208] decrementer_common+0x158/0x160
> > [ 5948.366149] --- interrupt: 901 at start_this_handle+0xd0/0x4b0
> > LR = jbd2__journal_start+0x17c/0x2b0
> > [ 5948.366242] [c000000f1c1eb8f0] [c000000f20a5cb00] 0xc000000f20a5cb00 (unreliable)
> > [ 5948.366314] [c000000f1c1eba00] [c00000000040717c] jbd2__journal_start+0x17c/0x2b0
> > [ 5948.366388] [c000000f1c1eba70] [c00000000038edf4] __ext4_journal_start_sb+0x84/0x180
> > [ 5948.366459] [c000000f1c1ebad0] [c0000000003b47dc] ext4_da_write_begin+0x17c/0x520
> > [ 5948.366532] [c000000f1c1ebb90] [c00000000021f9c8] generic_perform_write+0xe8/0x250
> > [ 5948.366604] [c000000f1c1ebc20] [c000000000220d20] __generic_file_write_iter+0x200/0x240
> > [ 5948.366677] [c000000f1c1ebc80] [c00000000039d614] ext4_file_write_iter+0x2e4/0x4d0
> > [ 5948.373255] [c000000f1c1ebd00] [c0000000002e13c0] __vfs_write+0x120/0x200
> > [ 5948.373313] [c000000f1c1ebd90] [c0000000002e2c48] vfs_write+0xc8/0x240
> > [ 5948.373371] [c000000f1c1ebde0] [c0000000002e4940] SyS_write+0x60/0x110
> > [ 5948.373430] [c000000f1c1ebe30] [c00000000000b8e0] system_call+0x38/0xdc
> > [ 5948.373486] Instruction dump:
> > [ 5948.373521] 7fe3fb78 e8010010 eba1ffe8 ebc1fff0 ebe1fff8 7c0803a6 4e800020 8bad028a
> > [ 5948.373592] 7fe3fb78 4b64db15 60000000 7c210b78 <e92d0000> 89290009 792affe3 40820048
> > [ 5948.374515] Sending NMI from CPU 74 to CPUs
> >
>


--
Regards,

Abdul Haleem
IBM Linux Technology Centre



Attachments:
trace-with-lockdep (52.71 kB)

2017-06-30 07:28:20

by Nicholas Piggin

Subject: Re: [linux-next] cpus stalls detected few hours after booting next kernel

On Fri, 30 Jun 2017 10:52:18 +0530
Abdul Haleem <[email protected]> wrote:

> On Fri, 2017-06-30 at 00:45 +1000, Nicholas Piggin wrote:
> > On Thu, 29 Jun 2017 20:23:05 +1000
> > Nicholas Piggin <[email protected]> wrote:
> >
> > > On Thu, 29 Jun 2017 19:36:14 +1000
> > > Nicholas Piggin <[email protected]> wrote:
> >
> > > > I don't *think* the replay-wakeup-interrupt patch is directly involved, but
> > > > it's likely to be one of the idle patches.
> >
> > Okay this turned out to be misconfigured sleep states I added for the
> > simulator, sorry for the false alarm.
> >
> > > Although you have this in the backtrace. I wonder if that's a stuck
> > > lock in rcu_process_callbacks?
> >
> > So this spinlock becomes top of the list of suspects. Can you try
> > enabling lockdep and try to reproduce it?
>
> Yes, recreated again with CONFIG_LOCKDEP=y & CONFIG_DEBUG_LOCKDEP=y set.
> I do not see any difference in trace messages with and without LOCKDEP
> enabled.
>
> Please find the attached log file.

Can you get an rcu_invoke_callback event trace that Paul suggested?
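
For example, something along these lines should capture it (assuming the
standard rcu tracepoint name):

perf record -e rcu:rcu_invoke_callback -a -g -- sleep 600

or the equivalent via tracefs:

echo 1 > /sys/kernel/debug/tracing/events/rcu/rcu_invoke_callback/enable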

Does this bug show up with just the powerpc next branch?

Thanks,
Nick

2017-06-30 18:15:17

by Paul E. McKenney

Subject: Re: [linux-next] cpus stalls detected few hours after booting next kernel

On Fri, Jun 30, 2017 at 05:28:02PM +1000, Nicholas Piggin wrote:
> On Fri, 30 Jun 2017 10:52:18 +0530
> Abdul Haleem <[email protected]> wrote:
>
> > On Fri, 2017-06-30 at 00:45 +1000, Nicholas Piggin wrote:
> > > On Thu, 29 Jun 2017 20:23:05 +1000
> > > Nicholas Piggin <[email protected]> wrote:
> > >
> > > > On Thu, 29 Jun 2017 19:36:14 +1000
> > > > Nicholas Piggin <[email protected]> wrote:
> > >
> > > > > I don't *think* the replay-wakeup-interrupt patch is directly involved, but
> > > > > it's likely to be one of the idle patches.
> > >
> > > Okay this turned out to be misconfigured sleep states I added for the
> > > simulator, sorry for the false alarm.
> > >
> > > > Although you have this in the backtrace. I wonder if that's a stuck
> > > > lock in rcu_process_callbacks?
> > >
> > > So this spinlock becomes top of the list of suspects. Can you try
> > > enabling lockdep and try to reproduce it?
> >
> > Yes, recreated again with CONFIG_LOCKDEP=y & CONFIG_DEBUG_LOCKDEP=y set.
> > I do not see any difference in trace messages with and without LOCKDEP
> > enabled.
> >
> > Please find the attached log file.
>
> Can you get an rcu_invoke_callback event trace that Paul suggested?
>
> Does this bug show up with just the powerpc next branch?

And I must say that those RCU CPU stall warnings are spectacular!

Did you by chance boot with a small subset of the CPUs online, and
bring the rest online later on?
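
(If you want to check, the CPU online mask is visible in sysfs, assuming
the standard layout:

cat /sys/devices/system/cpu/online

A CPU that was brought online late would presumably have spent that time
accumulating the huge "GPs behind" counts shown in the stall report.)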

Thanx, Paul

2017-07-04 14:17:22

by Abdul Haleem

Subject: Re: [linux-next] cpus stalls detected few hours after booting next kernel

On Fri, 2017-06-30 at 17:28 +1000, Nicholas Piggin wrote:
> On Fri, 30 Jun 2017 10:52:18 +0530
> Abdul Haleem <[email protected]> wrote:
>
> > On Fri, 2017-06-30 at 00:45 +1000, Nicholas Piggin wrote:
> > > On Thu, 29 Jun 2017 20:23:05 +1000
> > > Nicholas Piggin <[email protected]> wrote:
> > >
> > > > On Thu, 29 Jun 2017 19:36:14 +1000
> > > > Nicholas Piggin <[email protected]> wrote:
> > >
> > > > > I don't *think* the replay-wakeup-interrupt patch is directly involved, but
> > > > > it's likely to be one of the idle patches.
> > >
> > > Okay this turned out to be misconfigured sleep states I added for the
> > > simulator, sorry for the false alarm.
> > >
> > > > Although you have this in the backtrace. I wonder if that's a stuck
> > > > lock in rcu_process_callbacks?
> > >
> > > So this spinlock becomes top of the list of suspects. Can you try
> > > enabling lockdep and try to reproduce it?
> >
> > Yes, recreated again with CONFIG_LOCKDEP=y & CONFIG_DEBUG_LOCKDEP=y set.
> > I do not see any difference in trace messages with and without LOCKDEP
> > enabled.
> >
> > Please find the attached log file.
>
> Can you get an rcu_invoke_callback event trace that Paul suggested?

I could not reproduce the issue with the latest next kernel
(4.12.0-rc7-next-20170703).

> Does this bug show up with just the powerpc next branch?

Perhaps not: a stress-ng test on today's mainline kernel (4.12.0)
triggered a different trace message, though it is not reproducible
every time.


stress-ng --io 100 --vm 10 --vm-bytes 100g --timeout 1h --oomable

INFO: rcu_sched self-detected stall on CPU
8-...: (2099 ticks this GP) idle=37e/140000000000001/0 softirq=1404131/1404131 fqs=932
(t=2100 jiffies g=394894 c=394893 q=21997)
Task dump for CPU 8:
kworker/u162:2 R running task 11168 28468 2 0x00000884
Workqueue: writeback wb_workfn (flush-253:1)
Call Trace:
[c00000036e3eb340] [c000000000114480] sched_show_task+0xf0/0x160 (unreliable)
[c00000036e3eb3b0] [c0000000009adc2c] rcu_dump_cpu_stacks+0xd0/0x134
[c00000036e3eb400] [c00000000015e4d0] rcu_check_callbacks+0x8f0/0xaf0
[c00000036e3eb530] [c000000000165e2c] update_process_times+0x3c/0x90
[c00000036e3eb560] [c00000000017b73c] tick_sched_handle.isra.13+0x2c/0xc0
[c00000036e3eb590] [c00000000017b828] tick_sched_timer+0x58/0xb0
[c00000036e3eb5d0] [c0000000001669e8] __hrtimer_run_queues+0xf8/0x330
[c00000036e3eb650] [c000000000167744] hrtimer_interrupt+0xe4/0x280
[c00000036e3eb710] [c000000000022620] __timer_interrupt+0x90/0x270
[c00000036e3eb760] [c000000000022cf0] timer_interrupt+0xa0/0xe0
[c00000036e3eb790] [c0000000000091e8] decrementer_common+0x158/0x160
--- interrupt: 901 at move_expired_inodes+0x30/0x200
LR = queue_io+0x8c/0x190
[c00000036e3eba80] [c00000036e3ebac0] 0xc00000036e3ebac0 (unreliable)
[c00000036e3ebac0] [c000000000319b24] wb_writeback+0x2b4/0x420
[c00000036e3ebb90] [c00000000031a980] wb_workfn+0xf0/0x4b0
[c00000036e3ebca0] [c0000000000fa160] process_one_work+0x180/0x470
[c00000036e3ebd30] [c0000000000fa6d4] worker_thread+0x284/0x500
[c00000036e3ebdc0] [c000000000101fc0] kthread+0x160/0x1a0
[c00000036e3ebe30] [c00000000000bb60] ret_from_kernel_thread+0x5c/0x7c
INFO: rcu_sched self-detected stall on CPU
51-...: (2099 ticks this GP) idle=7c2/140000000000001/0 softirq=1272749/1272749 fqs=995
(t=2100 jiffies g=394900 c=394899 q=32186)

stress-ng : http://kernel.ubuntu.com/git/cking/stress-ng.git

>
> Thanks,
> Nick
>


--
Regards,

Abdul Haleem
IBM Linux Technology Centre



2017-07-04 23:31:28

by Paul E. McKenney

Subject: Re: [linux-next] cpus stalls detected few hours after booting next kernel

On Tue, Jul 04, 2017 at 07:44:58PM +0530, Abdul Haleem wrote:
> On Fri, 2017-06-30 at 17:28 +1000, Nicholas Piggin wrote:
> > On Fri, 30 Jun 2017 10:52:18 +0530
> > Abdul Haleem <[email protected]> wrote:
> >
> > > On Fri, 2017-06-30 at 00:45 +1000, Nicholas Piggin wrote:
> > > > On Thu, 29 Jun 2017 20:23:05 +1000
> > > > Nicholas Piggin <[email protected]> wrote:
> > > >
> > > > > On Thu, 29 Jun 2017 19:36:14 +1000
> > > > > Nicholas Piggin <[email protected]> wrote:
> > > >
> > > > > > I don't *think* the replay-wakeup-interrupt patch is directly involved, but
> > > > > > it's likely to be one of the idle patches.
> > > >
> > > > Okay this turned out to be misconfigured sleep states I added for the
> > > > simulator, sorry for the false alarm.
> > > >
> > > > > Although you have this in the backtrace. I wonder if that's a stuck
> > > > > lock in rcu_process_callbacks?
> > > >
> > > > So this spinlock becomes top of the list of suspects. Can you try
> > > > enabling lockdep and try to reproduce it?
> > >
> > > Yes, recreated again with CONFIG_LOCKDEP=y & CONFIG_DEBUG_LOCKDEP=y set.
> > > I do not see any difference in trace messages with and without LOCKDEP
> > > enabled.
> > >
> > > Please find the attached log file.
> >
> > Can you get an rcu_invoke_callback event trace that Paul suggested?
>
> I could not reproduce the issue with the latest next kernel
> (4.12.0-rc7-next-20170703).
>
> > Does this bug show up with just the powerpc next branch?
>
> Perhaps not: a stress-ng test on today's mainline kernel (4.12.0)
> triggered a different trace message, though it is not reproducible
> every time.
>
>
> stress-ng --io 100 --vm 10 --vm-bytes 100g --timeout 1h --oomable
>
> INFO: rcu_sched self-detected stall on CPU
> 8-...: (2099 ticks this GP) idle=37e/140000000000001/0 softirq=1404131/1404131 fqs=932
> (t=2100 jiffies g=394894 c=394893 q=21997)
> Task dump for CPU 8:
> kworker/u162:2 R running task 11168 28468 2 0x00000884
> Workqueue: writeback wb_workfn (flush-253:1)
> Call Trace:
> [c00000036e3eb340] [c000000000114480] sched_show_task+0xf0/0x160 (unreliable)
> [c00000036e3eb3b0] [c0000000009adc2c] rcu_dump_cpu_stacks+0xd0/0x134
> [c00000036e3eb400] [c00000000015e4d0] rcu_check_callbacks+0x8f0/0xaf0
> [c00000036e3eb530] [c000000000165e2c] update_process_times+0x3c/0x90
> [c00000036e3eb560] [c00000000017b73c] tick_sched_handle.isra.13+0x2c/0xc0
> [c00000036e3eb590] [c00000000017b828] tick_sched_timer+0x58/0xb0
> [c00000036e3eb5d0] [c0000000001669e8] __hrtimer_run_queues+0xf8/0x330
> [c00000036e3eb650] [c000000000167744] hrtimer_interrupt+0xe4/0x280
> [c00000036e3eb710] [c000000000022620] __timer_interrupt+0x90/0x270
> [c00000036e3eb760] [c000000000022cf0] timer_interrupt+0xa0/0xe0
> [c00000036e3eb790] [c0000000000091e8] decrementer_common+0x158/0x160
> --- interrupt: 901 at move_expired_inodes+0x30/0x200
> LR = queue_io+0x8c/0x190
> [c00000036e3eba80] [c00000036e3ebac0] 0xc00000036e3ebac0 (unreliable)
> [c00000036e3ebac0] [c000000000319b24] wb_writeback+0x2b4/0x420

The usual assumption would be that wb_writeback() is looping or...

> [c00000036e3ebb90] [c00000000031a980] wb_workfn+0xf0/0x4b0
> [c00000036e3ebca0] [c0000000000fa160] process_one_work+0x180/0x470
> [c00000036e3ebd30] [c0000000000fa6d4] worker_thread+0x284/0x500

...there is so much work being scheduled that worker_thread() cannot
keep up. If tracing locates such a loop, the usual trick is to
place a cond_resched_rcu_qs() somewhere in it (but of course not
where interrupts or preemption are disabled).

Tracing can help detect this. Or printk()s, for that matter. ;-)
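
For instance, pointing the function_graph tracer at the suspect would
show whether a single wb_writeback() call ever returns or whether it is
being re-entered over and over (standard tracefs interface; the mount
point may differ):

cd /sys/kernel/debug/tracing
echo wb_writeback > set_graph_function
echo function_graph > current_tracer
cat trace_pipe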

Thanx, Paul

> [c00000036e3ebdc0] [c000000000101fc0] kthread+0x160/0x1a0
> [c00000036e3ebe30] [c00000000000bb60] ret_from_kernel_thread+0x5c/0x7c
> INFO: rcu_sched self-detected stall on CPU
> 51-...: (2099 ticks this GP) idle=7c2/140000000000001/0 softirq=1272749/1272749 fqs=995
> (t=2100 jiffies g=394900 c=394899 q=32186)
>
> stress-ng : http://kernel.ubuntu.com/git/cking/stress-ng.git
>
> >
> > Thanks,
> > Nick
> >
>
>
> --
> Regards,
>
> Abdul Haleem
> IBM Linux Technology Centre
>
>
>

2017-07-07 11:06:25

by Abdul Haleem

Subject: Re: [linux-next] cpus stalls detected few hours after booting next kernel

On Fri, 2017-06-30 at 17:28 +1000, Nicholas Piggin wrote:
> On Fri, 30 Jun 2017 10:52:18 +0530
> Abdul Haleem <[email protected]> wrote:
>
> > On Fri, 2017-06-30 at 00:45 +1000, Nicholas Piggin wrote:
> > > On Thu, 29 Jun 2017 20:23:05 +1000
> > > Nicholas Piggin <[email protected]> wrote:
> > >
> > > > On Thu, 29 Jun 2017 19:36:14 +1000
> > > > Nicholas Piggin <[email protected]> wrote:
> > >
> > > > > I don't *think* the replay-wakeup-interrupt patch is directly involved, but
> > > > > it's likely to be one of the idle patches.
> > >
> > > Okay this turned out to be misconfigured sleep states I added for the
> > > simulator, sorry for the false alarm.
> > >
> > > > Although you have this in the backtrace. I wonder if that's a stuck
> > > > lock in rcu_process_callbacks?
> > >
> > > So this spinlock becomes top of the list of suspects. Can you try
> > > enabling lockdep and try to reproduce it?
> >
> > Yes, recreated again with CONFIG_LOCKDEP=y & CONFIG_DEBUG_LOCKDEP=y set.
> > I do not see any difference in trace messages with and without LOCKDEP
> > enabled.
> >
> > Please find the attached log file.
>
> Can you get an rcu_invoke_callback event trace that Paul suggested?

Yes, I was able to collect the perf data for the rcu_invoke_callback
event on a recent next kernel (4.12.0-next-20170705). The issue is rare
to hit.

After booting the next kernel, I started the command 'perf record -e
rcu:rcu_invoke_callback -a -g -- cat' and waited for 30 minutes.

Five minutes after the stall messages appeared, I pressed CTRL-C to end
the perf command.

@Nicholas: the perf.data report is too large to attach here; shall I
send you the internal location of the file on Slack/mail? Also, the
machine is still in the same state if you want to use it.

>
> Does this bug show up with just the powerpc next branch?
>
> Thanks,
> Nick
>


--
Regards,

Abdul Haleem
IBM Linux Technology Centre



2017-07-25 05:20:22

by Abdul Haleem

Subject: Re: [linux-next] cpus stalls detected few hours after booting next kernel

On Fri, 2017-06-30 at 17:28 +1000, Nicholas Piggin wrote:
> On Fri, 30 Jun 2017 10:52:18 +0530
> Abdul Haleem <[email protected]> wrote:
>
> > On Fri, 2017-06-30 at 00:45 +1000, Nicholas Piggin wrote:
> > > On Thu, 29 Jun 2017 20:23:05 +1000
> > > Nicholas Piggin <[email protected]> wrote:
> > >
> > > > On Thu, 29 Jun 2017 19:36:14 +1000
> > > > Nicholas Piggin <[email protected]> wrote:
> > >
> > > > > I don't *think* the replay-wakeup-interrupt patch is directly involved, but
> > > > > it's likely to be one of the idle patches.
> > >
> > > Okay this turned out to be misconfigured sleep states I added for the
> > > simulator, sorry for the false alarm.
> > >
> > > > Although you have this in the backtrace. I wonder if that's a stuck
> > > > lock in rcu_process_callbacks?
> > >
> > > So this spinlock becomes top of the list of suspects. Can you try
> > > enabling lockdep and try to reproduce it?
> >
> > Yes, recreated again with CONFIG_LOCKDEP=y & CONFIG_DEBUG_LOCKDEP=y set.
> > I do not see any difference in trace messages with and without LOCKDEP
> > enabled.
> >
> > Please find the attached log file.
>
> Can you get an rcu_invoke_callback event trace that Paul suggested?

Yes, I have collected the perf report.
>
> Does this bug show up with just the powerpc next branch?

I have now started seeing the call trace on mainline too (4.13.0-rc2).

>
> Thanks,
> Nick
>


--
Regards,

Abdul Haleem
IBM Linux Technology Centre