We've found a workqueue stall / deadlock. Our workload is container-oriented
and uses IPv6. Our container (network namespace) churn is quite frequent,
and containers can be terminated before their networking is even set up.

We're running 4.19.73 in production, and from investigating the underlying
causes, I don't think that later 4.19 versions fix it.

We've narrowed it down to a lockup between ipv6_addrconf and cleanup_net.
crash> bt 8
PID: 8 TASK: ffff9a1072b50000 CPU: 24 COMMAND: "kworker/u192:0"
#0 [ffffbfe2c00fbb70] __schedule at ffffffffa7f02bf7
#1 [ffffbfe2c00fbc10] schedule at ffffffffa7f031e8
#2 [ffffbfe2c00fbc18] schedule_timeout at ffffffffa7f0700e
#3 [ffffbfe2c00fbc90] wait_for_completion at ffffffffa7f03b50
#4 [ffffbfe2c00fbce0] __flush_work at ffffffffa76a2532
#5 [ffffbfe2c00fbd58] rollback_registered_many at ffffffffa7dbcdf4
#6 [ffffbfe2c00fbdc0] unregister_netdevice_many at ffffffffa7dbd31e
#7 [ffffbfe2c00fbdd0] default_device_exit_batch at ffffffffa7dbd512
#8 [ffffbfe2c00fbe40] cleanup_net at ffffffffa7dab970
#9 [ffffbfe2c00fbe98] process_one_work at ffffffffa76a17c4
#10 [ffffbfe2c00fbed8] worker_thread at ffffffffa76a19dd
#11 [ffffbfe2c00fbf10] kthread at ffffffffa76a7fd3
#12 [ffffbfe2c00fbf50] ret_from_fork at ffffffffa80001ff
crash> bt 1369493
PID: 1369493 TASK: ffff9a03684d9600 CPU: 58 COMMAND: "kworker/58:1"
#0 [ffffbfe30d68fd48] __schedule at ffffffffa7f02bf7
#1 [ffffbfe30d68fde8] schedule at ffffffffa7f031e8
#2 [ffffbfe30d68fdf0] schedule_preempt_disabled at ffffffffa7f0349a
#3 [ffffbfe30d68fdf8] __mutex_lock at ffffffffa7f04aed
#4 [ffffbfe30d68fe90] addrconf_verify_work at ffffffffa7e8d1aa
#5 [ffffbfe30d68fe98] process_one_work at ffffffffa76a17c4
#6 [ffffbfe30d68fed8] worker_thread at ffffffffa76a19dd
#7 [ffffbfe30d68ff10] kthread at ffffffffa76a7fd3
#8 [ffffbfe30d68ff50] ret_from_fork at ffffffffa80001ff
crash> struct -x mutex.owner.counter rtnl_mutex
owner.counter = 0xffff9a1072b50001
0xffff9a1072b50001 & (~0x07) = 0xffff9a1072b50000
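The low three bits of mutex.owner are flag bits, so masking them off
recovers the task_struct pointer. A minimal sketch of the decode, assuming
the upstream MUTEX_FLAGS mask of 0x07 from kernel/locking/mutex.c:

#include <linux/sched.h>	/* struct task_struct */

/* Clearing the MUTEX_FLAG_* bits (0x07) in mutex->owner yields the
 * owning task, here 0xffff9a1072b50000. */
static inline struct task_struct *decode_mutex_owner(unsigned long owner)
{
	return (struct task_struct *)(owner & ~0x07UL);
}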
This points back to PID 8 / CPU 24, which is running cleanup_net. Part of
cleanup_net is calling ops_exit_list, which in turn calls
default_device_exit_batch. default_device_exit_batch takes the rtnl lock
before calling into unregister_netdevice_many and rollback_registered_many,
and rollback_registered_many calls flush_all_backlogs. This will never
complete because PID 8 is holding the rtnl lock, and PID 1369493 / CPU 58
is waiting for rtnl_lock.
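For context, flush_all_backlogs() queues a flush_backlog work item on every
online CPU's high-priority workqueue and then flush_work()s each of them, so
it cannot return until every per-cpu pool has run its flush_backlog. This is
a paraphrased sketch of net/core/dev.c on 4.19, not the verbatim source:

/* Paraphrased sketch of flush_all_backlogs() (4.19, net/core/dev.c).
 * The flush_work() loop is the wait_for_completion()/__flush_work()
 * seen in PID 8's stack above.
 */
static void flush_all_backlogs(void)
{
	unsigned int cpu;

	get_online_cpus();

	for_each_online_cpu(cpu)
		queue_work_on(cpu, system_highpri_wq,
			      per_cpu_ptr(&flush_works, cpu));

	for_each_online_cpu(cpu)
		flush_work(per_cpu_ptr(&flush_works, cpu));

	put_online_cpus();
}

In the workqueue dump further below, the events_highpri pwq for cpus=70
still has flush_backlog pending, which appears to be one of the per-cpu
works PID 8 is waiting on.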
If relevant, the workqueue stalls themselves look something like:
BUG: workqueue lockup - pool cpus=70 node=0 flags=0x0 nice=0 stuck for 3720s!
BUG: workqueue lockup - pool cpus=70 node=0 flags=0x0 nice=-20 stuck for 3719s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 32: cpus=16 node=0 flags=0x0 nice=0 active=2/256
in-flight: 1274779:slab_caches_to_rcu_destroy_workfn slab_caches_to_rcu_destroy_workfn
workqueue events_highpri: flags=0x10
pwq 141: cpus=70 node=0 flags=0x0 nice=-20 active=1/256
pending: flush_backlog BAR(8)
workqueue events_power_efficient: flags=0x82
pwq 193: cpus=0-23,48-71 node=0 flags=0x4 nice=0 active=1/256
in-flight: 1396446:check_lifetime
workqueue mm_percpu_wq: flags=0x8
pwq 140: cpus=70 node=0 flags=0x0 nice=0 active=1/256
pending: vmstat_update
workqueue netns: flags=0xe000a
pwq 192: cpus=0-95 flags=0x4 nice=0 active=1/1
in-flight: 8:cleanup_net
delayed: cleanup_net
workqueue writeback: flags=0x4e
pwq 193: cpus=0-23,48-71 node=0 flags=0x4 nice=0 active=1/256
in-flight: 1334335:wb_workfn
workqueue kblockd: flags=0x18
pwq 141: cpus=70 node=0 flags=0x0 nice=-20 active=1/256
pending: blk_mq_run_work_fn
workqueue ipv6_addrconf: flags=0x40008
pwq 116: cpus=58 node=0 flags=0x0 nice=0 active=1/1
in-flight: 1369493:addrconf_verify_work
workqueue ena: flags=0xe000a
pwq 192: cpus=0-95 flags=0x4 nice=0 active=1/1
in-flight: 7505:ena_fw_reset_device [ena]
On 2/11/20 11:23 AM, Sargun Dhillon wrote:
> We've found a workqueue stall / deadlock. Our workload is a container-oriented
> workload in which we utilize IPv6. Our container (namespace) churn is quite
> frequent, and containers can be terminated before their networking is
> even setup.
>
> We're running 4.19.73 in production, and in investigation of the underlying
> causes, I don't think that future versions of 4.19 fix it.
>
> We've narrowed it down to a lockup between ipv6_addrconf, and cleanup_net.
Sure, PID 1369493 addrconf_verify_work() is waiting for RTNL.

But PID 8?

__flush_work() is being called.

But from where? The stacks do not seem complete.
>
> crash> bt 8
> PID: 8 TASK: ffff9a1072b50000 CPU: 24 COMMAND: "kworker/u192:0"
> #0 [ffffbfe2c00fbb70] __schedule at ffffffffa7f02bf7
> #1 [ffffbfe2c00fbc10] schedule at ffffffffa7f031e8
> #2 [ffffbfe2c00fbc18] schedule_timeout at ffffffffa7f0700e
> #3 [ffffbfe2c00fbc90] wait_for_completion at ffffffffa7f03b50
> #4 [ffffbfe2c00fbce0] __flush_work at ffffffffa76a2532
> #5 [ffffbfe2c00fbd58] rollback_registered_many at ffffffffa7dbcdf4
> #6 [ffffbfe2c00fbdc0] unregister_netdevice_many at ffffffffa7dbd31e
> #7 [ffffbfe2c00fbdd0] default_device_exit_batch at ffffffffa7dbd512
> #8 [ffffbfe2c00fbe40] cleanup_net at ffffffffa7dab970
> #9 [ffffbfe2c00fbe98] process_one_work at ffffffffa76a17c4
> #10 [ffffbfe2c00fbed8] worker_thread at ffffffffa76a19dd
> #11 [ffffbfe2c00fbf10] kthread at ffffffffa76a7fd3
> #12 [ffffbfe2c00fbf50] ret_from_fork at ffffffffa80001ff
>
> crash> bt 1369493
> PID: 1369493 TASK: ffff9a03684d9600 CPU: 58 COMMAND: "kworker/58:1"
> #0 [ffffbfe30d68fd48] __schedule at ffffffffa7f02bf7
> #1 [ffffbfe30d68fde8] schedule at ffffffffa7f031e8
> #2 [ffffbfe30d68fdf0] schedule_preempt_disabled at ffffffffa7f0349a
> #3 [ffffbfe30d68fdf8] __mutex_lock at ffffffffa7f04aed
> #4 [ffffbfe30d68fe90] addrconf_verify_work at ffffffffa7e8d1aa
> #5 [ffffbfe30d68fe98] process_one_work at ffffffffa76a17c4
> #6 [ffffbfe30d68fed8] worker_thread at ffffffffa76a19dd
> #7 [ffffbfe30d68ff10] kthread at ffffffffa76a7fd3
> #8 [ffffbfe30d68ff50] ret_from_fork at ffffffffa80001ff
>
>
>
> struct -x mutex.owner.counter rtnl_mutex
> owner.counter = 0xffff9a1072b50001
>
> 0xffff9a1072b50001 & (~0x07) = 0xffff9a1072b50000
>
> This points back to PID 8 / CPU 24. It is working on cleanup_net, and a part
> of cleanup net involves calling ops_exit_list, and as part of that it calls
> default_device_exit_batch. default_device_exit_batch takes the rtnl lock before
> calling into unregister_netdevice_many, and rollback_registered_many.
> rollback_registered_many calls flush_all_backlogs. This will never complete
> because it is holding the rtnl lock, and PID 1369493 / CPU 58 is waiting
> for rtnl_lock.
But PID 1369493 is waiting on a mutex, thus properly yielding the CPU
(schedule() is clearly shown).

This should not prevent other threads from making progress, so
flush_all_backlogs() should complete eventually.

flush_all_backlogs() does not care how many threads are currently blocked
waiting for rtnl; they cannot grab it while flush_all_backlogs() is running.
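For reference, the per-cpu work being flushed is flush_backlog(), which only
drops queued skbs for devices being unregistered from the local softnet_data
queues; it never takes rtnl, so threads sleeping on rtnl cannot hold it up.
Roughly (paraphrased and simplified from net/core/dev.c, with the input
queue locking elided; not the verbatim source):

static void flush_backlog(struct work_struct *work)
{
	struct softnet_data *sd = this_cpu_ptr(&softnet_data);
	struct sk_buff *skb, *tmp;

	local_bh_disable();
	/* drop packets queued for devices that are going away */
	skb_queue_walk_safe(&sd->input_pkt_queue, skb, tmp) {
		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
			__skb_unlink(skb, &sd->input_pkt_queue);
			kfree_skb(skb);
		}
	}
	skb_queue_walk_safe(&sd->process_queue, skb, tmp) {
		if (skb->dev->reg_state == NETREG_UNREGISTERING) {
			__skb_unlink(skb, &sd->process_queue);
			kfree_skb(skb);
		}
	}
	local_bh_enable();
}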
>
> If relevant, the workqueue stalls themselves look something like:
> BUG: workqueue lockup - pool cpus=70 node=0 flags=0x0 nice=0 stuck for 3720s!
> BUG: workqueue lockup - pool cpus=70 node=0 flags=0x0 nice=-20 stuck for 3719s!
> Showing busy workqueues and worker pools:
> workqueue events: flags=0x0
> pwq 32: cpus=16 node=0 flags=0x0 nice=0 active=2/256
> in-flight: 1274779:slab_caches_to_rcu_destroy_workfn slab_caches_to_rcu_destroy_workfn
> workqueue events_highpri: flags=0x10
> pwq 141: cpus=70 node=0 flags=0x0 nice=-20 active=1/256
> pending: flush_backlog BAR(8)
> workqueue events_power_efficient: flags=0x82
> pwq 193: cpus=0-23,48-71 node=0 flags=0x4 nice=0 active=1/256
> in-flight: 1396446:check_lifetime
> workqueue mm_percpu_wq: flags=0x8
> pwq 140: cpus=70 node=0 flags=0x0 nice=0 active=1/256
> pending: vmstat_update
> workqueue netns: flags=0xe000a
> pwq 192: cpus=0-95 flags=0x4 nice=0 active=1/1
> in-flight: 8:cleanup_net
> delayed: cleanup_net
> workqueue writeback: flags=0x4e
> pwq 193: cpus=0-23,48-71 node=0 flags=0x4 nice=0 active=1/256
> in-flight: 1334335:wb_workfn
> workqueue kblockd: flags=0x18
> pwq 141: cpus=70 node=0 flags=0x0 nice=-20 active=1/256
> pending: blk_mq_run_work_fn
> workqueue ipv6_addrconf: flags=0x40008
> pwq 116: cpus=58 node=0 flags=0x0 nice=0 active=1/1
> in-flight: 1369493:addrconf_verify_work
> workqueue ena: flags=0xe000a
> pwq 192: cpus=0-95 flags=0x4 nice=0 active=1/1
> in-flight: 7505:ena_fw_reset_device [ena]
>
On 2/11/20 9:08 PM, Eric Dumazet wrote:
>
>
> On 2/11/20 11:23 AM, Sargun Dhillon wrote:
>> We've found a workqueue stall / deadlock. Our workload is a container-oriented
>> workload in which we utilize IPv6. Our container (namespace) churn is quite
>> frequent, and containers can be terminated before their networking is
>> even setup.
>>
>> We're running 4.19.73 in production, and in investigation of the underlying
>> causes, I don't think that future versions of 4.19 fix it.
>>
>> We've narrowed it down to a lockup between ipv6_addrconf, and cleanup_net.
>
> Sure, PID 1369493 addrconf_verify_work() is waiting for RTNL.
>
> But PID 8 ?
>
> __flush_work() is being called.
>
> But from where ? Stacks seem not complete.
>
>
>>
>> crash> bt 8
>> PID: 8 TASK: ffff9a1072b50000 CPU: 24 COMMAND: "kworker/u192:0"
>> #0 [ffffbfe2c00fbb70] __schedule at ffffffffa7f02bf7
>> #1 [ffffbfe2c00fbc10] schedule at ffffffffa7f031e8
>> #2 [ffffbfe2c00fbc18] schedule_timeout at ffffffffa7f0700e
>> #3 [ffffbfe2c00fbc90] wait_for_completion at ffffffffa7f03b50
>> #4 [ffffbfe2c00fbce0] __flush_work at ffffffffa76a2532
>> #5 [ffffbfe2c00fbd58] rollback_registered_many at ffffffffa7dbcdf4
>> #6 [ffffbfe2c00fbdc0] unregister_netdevice_many at ffffffffa7dbd31e
>> #7 [ffffbfe2c00fbdd0] default_device_exit_batch at ffffffffa7dbd512
>> #8 [ffffbfe2c00fbe40] cleanup_net at ffffffffa7dab970
>> #9 [ffffbfe2c00fbe98] process_one_work at ffffffffa76a17c4
>> #10 [ffffbfe2c00fbed8] worker_thread at ffffffffa76a19dd
>> #11 [ffffbfe2c00fbf10] kthread at ffffffffa76a7fd3
>> #12 [ffffbfe2c00fbf50] ret_from_fork at ffffffffa80001ff
>>
>> crash> bt 1369493
>> PID: 1369493 TASK: ffff9a03684d9600 CPU: 58 COMMAND: "kworker/58:1"
>> #0 [ffffbfe30d68fd48] __schedule at ffffffffa7f02bf7
>> #1 [ffffbfe30d68fde8] schedule at ffffffffa7f031e8
>> #2 [ffffbfe30d68fdf0] schedule_preempt_disabled at ffffffffa7f0349a
>> #3 [ffffbfe30d68fdf8] __mutex_lock at ffffffffa7f04aed
>> #4 [ffffbfe30d68fe90] addrconf_verify_work at ffffffffa7e8d1aa
>> #5 [ffffbfe30d68fe98] process_one_work at ffffffffa76a17c4
>> #6 [ffffbfe30d68fed8] worker_thread at ffffffffa76a19dd
>> #7 [ffffbfe30d68ff10] kthread at ffffffffa76a7fd3
>> #8 [ffffbfe30d68ff50] ret_from_fork at ffffffffa80001ff
>>
>>
>>
>> struct -x mutex.owner.counter rtnl_mutex
>> owner.counter = 0xffff9a1072b50001
>>
>> 0xffff9a1072b50001 & (~0x07) = 0xffff9a1072b50000
>>
>> This points back to PID 8 / CPU 24. It is working on cleanup_net, and a part
>> of cleanup net involves calling ops_exit_list, and as part of that it calls
>> default_device_exit_batch. default_device_exit_batch takes the rtnl lock before
>> calling into unregister_netdevice_many, and rollback_registered_many.
>> rollback_registered_many calls flush_all_backlogs. This will never complete
>> because it is holding the rtnl lock, and PID 1369493 / CPU 58 is waiting
>> for rtnl_lock.
>
> But PID 1369493 is waiting on a mutex, thus properly yielding the cpu.
> (schedule() is clearly shown)
>
> This should not prevent other threads
> from making progress so that flush_all_backlogs() completes eventually.
>
> flush_all_backlogs() does not care of how many threads are currently blocked
> because they can not grab rtnl while flush_all_backlogs() is running.
>
>>
>> If relevant, the workqueue stalls themselves look something like:
>> BUG: workqueue lockup - pool cpus=70 node=0 flags=0x0 nice=0 stuck for 3720s!
>> BUG: workqueue lockup - pool cpus=70 node=0 flags=0x0 nice=-20 stuck for 3719s!
>> Showing busy workqueues and worker pools:
>> workqueue events: flags=0x0
>> pwq 32: cpus=16 node=0 flags=0x0 nice=0 active=2/256
>> in-flight: 1274779:slab_caches_to_rcu_destroy_workfn slab_caches_to_rcu_destroy_workfn
>> workqueue events_highpri: flags=0x10
>> pwq 141: cpus=70 node=0 flags=0x0 nice=-20 active=1/256
>> pending: flush_backlog BAR(8)
>> workqueue events_power_efficient: flags=0x82
>> pwq 193: cpus=0-23,48-71 node=0 flags=0x4 nice=0 active=1/256
>> in-flight: 1396446:check_lifetime
>> workqueue mm_percpu_wq: flags=0x8
>> pwq 140: cpus=70 node=0 flags=0x0 nice=0 active=1/256
>> pending: vmstat_update
>> workqueue netns: flags=0xe000a
>> pwq 192: cpus=0-95 flags=0x4 nice=0 active=1/1
>> in-flight: 8:cleanup_net
>> delayed: cleanup_net
>> workqueue writeback: flags=0x4e
>> pwq 193: cpus=0-23,48-71 node=0 flags=0x4 nice=0 active=1/256
>> in-flight: 1334335:wb_workfn
>> workqueue kblockd: flags=0x18
>> pwq 141: cpus=70 node=0 flags=0x0 nice=-20 active=1/256
>> pending: blk_mq_run_work_fn
>> workqueue ipv6_addrconf: flags=0x40008
>> pwq 116: cpus=58 node=0 flags=0x0 nice=0 active=1/1
>> in-flight: 1369493:addrconf_verify_work
>> workqueue ena: flags=0xe000a
>> pwq 192: cpus=0-95 flags=0x4 nice=0 active=1/1
>> in-flight: 7505:ena_fw_reset_device [ena]
>>
Can you test the following:
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index cb493e15959c4d1bb68cf30f4099a8daa785bb84..bcc7ce03f13881415f64c7329559c7ed1e6321f3 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -4410,8 +4410,6 @@ static void addrconf_verify_rtnl(void)
 	now = jiffies;
 	next = round_jiffies_up(now + ADDR_CHECK_FREQUENCY);
 
-	cancel_delayed_work(&addr_chk_work);
-
 	for (i = 0; i < IN6_ADDR_HSIZE; i++) {
 restart:
 		hlist_for_each_entry_rcu_bh(ifp, &inet6_addr_lst[i], addr_lst) {
Hello guys,

At the beginning of the new year, I also ran into this issue.

Hi Sargun,

Did you ever resolve it? It has been about three years, but I hope you
still remember something about this case. Thanks.
Hi Eric,

Below is my log; let me also give some feedback on your earlier comments.
Thanks.
Jan 1 00:06:30 kernel: [109121.968881] 000: perf: interrupt took too long (3914 > 3912), lowering kernel.perf_event_max_sample_rate to 51000
Jan 2 00:00:06 kernel: [195138.235171] 026: audit: type=1400 audit(1672588806.418:41): apparmor="DENIED" operation="capable" profile="/usr/sbin/cups-browsed" pid=6221 comm="cups-browsed" capability=23 capname="sys_nice"
Jan 2 00:04:08 kernel: [195380.604772] 027: INFO: task kworker/u56:2:6079 blocked for more than 122 seconds.
Jan 2 00:04:08 kernel: [195380.604776] 027: Tainted: G OE 5.4.161-rt67-rc1 #1
Jan 2 00:04:08 kernel: [195380.604777] 027: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 2 00:04:08 kernel: [195380.604779] 027: kworker/u56:2 D 0 6079 2 0x80084000
Jan 2 00:04:08 kernel: [195380.604789] 027: Workqueue: netns cleanup_net
Jan 2 00:04:08 kernel: [195380.604790] 027: Call Trace:
Jan 2 00:04:08 kernel: [195380.604793] 027: __schedule+0x3d4/0x8a0
Jan 2 00:04:08 kernel: [195380.604799] 027: ? __switch_to_asm+0x34/0x70
Jan 2 00:04:08 kernel: [195380.604801] 027: schedule+0x49/0x100
Jan 2 00:04:08 kernel: [195380.604804] 027: schedule_timeout+0x1ed/0x3b0
Jan 2 00:04:08 kernel: [195380.604807] 027: wait_for_completion+0x86/0xe0
Jan 2 00:04:08 kernel: [195380.604810] 027: __flush_work+0x121/0x1d0
Jan 2 00:04:08 kernel: [195380.604814] 027: ? flush_workqueue_prep_pwqs+0x140/0x140
Jan 2 00:04:08 kernel: [195380.604817] 027: flush_work+0x10/0x20
Jan 2 00:04:08 kernel: [195380.604819] 027: rollback_registered_many+0x1b2/0x530
Jan 2 00:04:08 kernel: [195380.604824] 027: unregister_netdevice_many.part.0+0x12/0x90
Jan 2 00:04:08 kernel: [195380.604826] 027: default_device_exit_batch+0x15c/0x190
Jan 2 00:04:08 kernel: [195380.604828] 027: ? do_wait_intr_irq+0x90/0x90
Jan 2 00:04:08 kernel: [195380.604832] 027: ops_exit_list.isra.0+0x61/0x70
Jan 2 00:04:08 kernel: [195380.604835] 027: cleanup_net+0x269/0x3a0
Jan 2 00:04:08 kernel: [195380.604837] 027: process_one_work+0x1c8/0x470
Jan 2 00:04:08 kernel: [195380.604840] 027: worker_thread+0x4a/0x3d0
Jan 2 00:04:08 kernel: [195380.604843] 027: kthread+0x133/0x180
Jan 2 00:04:08 kernel: [195380.604846] 027: ? process_one_work+0x470/0x470
Jan 2 00:04:08 kernel: [195380.604848] 027: ? kthread_park+0x90/0x90
Jan 2 00:04:08 kernel: [195380.604850] 027: ret_from_fork+0x35/0x40
Jan 2 00:06:11 kernel: [195503.484781] 000: INFO: task kworker/u56:2:6079 blocked for more than 245 seconds.
Jan 2 00:06:11 kernel: [195503.484784] 000: Tainted: G OE 5.4.161-rt67-rc1 #1
Jan 2 00:06:11 kernel: [195503.484786] 000: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 2 00:06:11 kernel: [195503.484788] 000: kworker/u56:2 D 0 6079 2 0x80084000
Jan 2 00:06:11 kernel: [195503.484797] 000: Workqueue: netns cleanup_net
Jan 2 00:06:11 kernel: [195503.484798] 000: Call Trace:
Jan 2 00:06:11 kernel: [195503.484802] 000: __schedule+0x3d4/0x8a0
Jan 2 00:06:11 kernel: [195503.484806] 000: ? __switch_to_asm+0x34/0x70
Jan 2 00:06:11 kernel: [195503.484809] 000: schedule+0x49/0x100
Jan 2 00:06:11 kernel: [195503.484811] 000: schedule_timeout+0x1ed/0x3b0
Jan 2 00:06:11 kernel: [195503.484815] 000: wait_for_completion+0x86/0xe0
Jan 2 00:06:11 kernel: [195503.484818] 000: __flush_work+0x121/0x1d0
Jan 2 00:06:11 kernel: [195503.484822] 000: ? flush_workqueue_prep_pwqs+0x140/0x140
Jan 2 00:06:11 kernel: [195503.484825] 000: flush_work+0x10/0x20
Jan 2 00:06:11 kernel: [195503.484827] 000: rollback_registered_many+0x1b2/0x530
Jan 2 00:06:11 kernel: [195503.484832] 000: unregister_netdevice_many.part.0+0x12/0x90
Jan 2 00:06:11 kernel: [195503.484834] 000: default_device_exit_batch+0x15c/0x190
Jan 2 00:06:11 kernel: [195503.484837] 000: ? do_wait_intr_irq+0x90/0x90
Jan 2 00:06:11 kernel: [195503.484840] 000: ops_exit_list.isra.0+0x61/0x70
Jan 2 00:06:11 kernel: [195503.484843] 000: cleanup_net+0x269/0x3a0
Jan 2 00:06:11 kernel: [195503.484846] 000: process_one_work+0x1c8/0x470
Jan 2 00:06:11 kernel: [195503.484849] 000: worker_thread+0x4a/0x3d0
Jan 2 00:06:11 kernel: [195503.484852] 000: kthread+0x133/0x180
Jan 2 00:06:11 kernel: [195503.484854] 000: ? process_one_work+0x470/0x470
Jan 2 00:06:11 kernel: [195503.484856] 000: ? kthread_park+0x90/0x90
Jan 2 00:06:11 kernel: [195503.484858] 000: ret_from_fork+0x35/0x40
Jan 2 00:06:11 kernel: [195503.484863] 000: INFO: task kworker/26:0:6200 blocked for more than 122 seconds.
Jan 2 00:06:11 kernel: [195503.484864] 000: Tainted: G OE 5.4.161-rt67-rc1 #1
Jan 2 00:06:11 kernel: [195503.484865] 000: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 2 00:06:11 kernel: [195503.484866] 000: kworker/26:0 D 0 6200 2 0x80084000
Jan 2 00:06:11 kernel: [195503.484873] 000: Workqueue: ipv6_addrconf addrconf_verify_work
Jan 2 00:06:11 kernel: [195503.484874] 000: Call Trace:
Jan 2 00:06:11 kernel: [195503.484875] 000: __schedule+0x3d4/0x8a0
Jan 2 00:06:11 kernel: [195503.484877] 000: schedule+0x49/0x100
Jan 2 00:06:11 kernel: [195503.484879] 000: __rt_mutex_slowlock+0x8a/0x150
Jan 2 00:06:11 kernel: [195503.484882] 000: rt_mutex_slowlock_locked+0xbb/0x280
Jan 2 00:06:11 kernel: [195503.484884] 000: ? __switch_to_asm+0x40/0x70
Jan 2 00:06:11 kernel: [195503.484886] 000: rt_mutex_slowlock+0x76/0xc0
Jan 2 00:06:11 kernel: [195503.484889] 000: __rt_mutex_lock_state+0x75/0x90
Jan 2 00:06:11 kernel: [195503.484891] 000: _mutex_lock+0x13/0x20
Jan 2 00:06:11 kernel: [195503.484894] 000: rtnl_lock+0x15/0x20
Jan 2 00:06:11 kernel: [195503.484897] 000: addrconf_verify_work+0xe/0x20
Jan 2 00:06:11 kernel: [195503.484899] 000: process_one_work+0x1c8/0x470
Jan 2 00:06:11 kernel: [195503.484902] 000: worker_thread+0x4a/0x3d0
Jan 2 00:06:11 kernel: [195503.484905] 000: kthread+0x133/0x180
Jan 2 00:06:11 kernel: [195503.484906] 000: ? process_one_work+0x470/0x470
Jan 2 00:06:11 kernel: [195503.484908] 000: ? kthread_park+0x90/0x90
Jan 2 00:06:11 kernel: [195503.484910] 000: ret_from_fork+0x35/0x40
Jan 2 00:08:14 kernel: [195626.364781] 027: INFO: task kworker/u56:2:6079 blocked for more than 368 seconds.
Jan 2 00:08:14 kernel: [195626.364785] 027: Tainted: G OE 5.4.161-rt67-rc1 #1
Jan 2 00:08:14 kernel: [195626.364786] 027: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 2 00:08:14 kernel: [195626.364788] 027: kworker/u56:2 D 0 6079 2 0x80084000
Jan 2 00:08:14 kernel: [195626.364797] 027: Workqueue: netns cleanup_net
Jan 2 00:08:14 kernel: [195626.364798] 027: Call Trace:
Jan 2 00:08:14 kernel: [195626.364802] 027: __schedule+0x3d4/0x8a0
Jan 2 00:08:14 kernel: [195626.364807] 027: ? __switch_to_asm+0x34/0x70
Jan 2 00:08:14 kernel: [195626.364810] 027: schedule+0x49/0x100
Jan 2 00:08:14 kernel: [195626.364812] 027: schedule_timeout+0x1ed/0x3b0
Jan 2 00:08:14 kernel: [195626.364816] 027: wait_for_completion+0x86/0xe0
Jan 2 00:08:14 kernel: [195626.364818] 027: __flush_work+0x121/0x1d0
Jan 2 00:08:14 kernel: [195626.364822] 027: ? flush_workqueue_prep_pwqs+0x140/0x140
Jan 2 00:08:14 kernel: [195626.364825] 027: flush_work+0x10/0x20
Jan 2 00:08:14 kernel: [195626.364827] 027: rollback_registered_many+0x1b2/0x530
Jan 2 00:08:14 kernel: [195626.364832] 027: unregister_netdevice_many.part.0+0x12/0x90
Jan 2 00:08:14 kernel: [195626.364835] 027: default_device_exit_batch+0x15c/0x190
Jan 2 00:08:14 kernel: [195626.364837] 027: ? do_wait_intr_irq+0x90/0x90
Jan 2 00:08:14 kernel: [195626.364841] 027: ops_exit_list.isra.0+0x61/0x70
Jan 2 00:08:14 kernel: [195626.364843] 027: cleanup_net+0x269/0x3a0
Jan 2 00:08:14 kernel: [195626.364846] 027: process_one_work+0x1c8/0x470
Jan 2 00:08:14 kernel: [195626.364849] 027: worker_thread+0x4a/0x3d0
Jan 2 00:08:14 kernel: [195626.364852] 027: kthread+0x133/0x180
Jan 2 00:08:14 kernel: [195626.364855] 027: ? process_one_work+0x470/0x470
Jan 2 00:08:14 kernel: [195626.364857] 027: ? kthread_park+0x90/0x90
Jan 2 00:08:14 kernel: [195626.364859] 027: ret_from_fork+0x35/0x40
Jan 2 00:08:14 kernel: [195626.364863] 027: INFO: task kworker/26:0:6200 blocked for more than 245 seconds.
Jan 2 00:08:14 kernel: [195626.364865] 027: Tainted: G OE 5.4.161-rt67-rc1 #1
Jan 2 00:08:14 kernel: [195626.364866] 027: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 2 00:08:14 kernel: [195626.364867] 027: kworker/26:0 D 0 6200 2 0x80084000
Jan 2 00:08:14 kernel: [195626.364873] 027: Workqueue: ipv6_addrconf addrconf_verify_work
Jan 2 00:08:14 kernel: [195626.364874] 027: Call Trace:
Jan 2 00:08:14 kernel: [195626.364875] 027: __schedule+0x3d4/0x8a0
Jan 2 00:08:14 kernel: [195626.364878] 027: schedule+0x49/0x100
Jan 2 00:08:14 kernel: [195626.364880] 027: __rt_mutex_slowlock+0x8a/0x150
Jan 2 00:08:14 kernel: [195626.364882] 027: rt_mutex_slowlock_locked+0xbb/0x280
Jan 2 00:08:14 kernel: [195626.364885] 027: ? __switch_to_asm+0x40/0x70
Jan 2 00:08:14 kernel: [195626.364886] 027: rt_mutex_slowlock+0x76/0xc0
Jan 2 00:08:14 kernel: [195626.364889] 027: __rt_mutex_lock_state+0x75/0x90
Jan 2 00:08:14 kernel: [195626.364892] 027: _mutex_lock+0x13/0x20
Jan 2 00:08:14 kernel: [195626.364894] 027: rtnl_lock+0x15/0x20
Jan 2 00:08:14 kernel: [195626.364898] 027: addrconf_verify_work+0xe/0x20
Jan 2 00:08:14 kernel: [195626.364900] 027: process_one_work+0x1c8/0x470
Jan 2 00:08:14 kernel: [195626.364902] 027: worker_thread+0x4a/0x3d0
Jan 2 00:08:14 kernel: [195626.364905] 027: kthread+0x133/0x180
Jan 2 00:08:14 kernel: [195626.364907] 027: ? process_one_work+0x470/0x470
Jan 2 00:08:14 kernel: [195626.364909] 027: ? kthread_park+0x90/0x90
Jan 2 00:08:14 kernel: [195626.364911] 027: ret_from_fork+0x35/0x40
................
Jan 2 00:14:23 kernel: [195995.004768] 000: INFO: task kworker/u56:2:6079 blocked for more than 737 seconds.
Jan 2 00:14:23 kernel: [195995.004771] 000: Tainted: G OE 5.4.161-rt67-rc1 #1
Jan 2 00:14:23 kernel: [195995.004772] 000: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 2 00:14:23 kernel: [195995.004774] 000: kworker/u56:2 D 0 6079 2 0x80084000
Jan 2 00:14:23 kernel: [195995.004783] 000: Workqueue: netns cleanup_net
Jan 2 00:14:23 kernel: [195995.004784] 000: Call Trace:
Jan 2 00:14:23 kernel: [195995.004788] 000: __schedule+0x3d4/0x8a0
Jan 2 00:14:23 kernel: [195995.004793] 000: ? __switch_to_asm+0x34/0x70
Jan 2 00:14:23 kernel: [195995.004795] 000: schedule+0x49/0x100
Jan 2 00:14:23 kernel: [195995.004797] 000: schedule_timeout+0x1ed/0x3b0
Jan 2 00:14:23 kernel: [195995.004801] 000: wait_for_completion+0x86/0xe0
Jan 2 00:14:23 kernel: [195995.004804] 000: __flush_work+0x121/0x1d0
Jan 2 00:14:23 kernel: [195995.004807] 000: ? flush_workqueue_prep_pwqs+0x140/0x140
Jan 2 00:14:23 kernel: [195995.004810] 000: flush_work+0x10/0x20
Jan 2 00:14:23 kernel: [195995.004813] 000: rollback_registered_many+0x1b2/0x530
Jan 2 00:14:23 kernel: [195995.004817] 000: unregister_netdevice_many.part.0+0x12/0x90
Jan 2 00:14:23 kernel: [195995.004819] 000: default_device_exit_batch+0x15c/0x190
Jan 2 00:14:23 kernel: [195995.004822] 000: ? do_wait_intr_irq+0x90/0x90
Jan 2 00:14:23 kernel: [195995.004825] 000: ops_exit_list.isra.0+0x61/0x70
Jan 2 00:14:23 kernel: [195995.004828] 000: cleanup_net+0x269/0x3a0
Jan 2 00:14:23 kernel: [195995.004831] 000: process_one_work+0x1c8/0x470
Jan 2 00:14:23 kernel: [195995.004834] 000: worker_thread+0x4a/0x3d0
Jan 2 00:14:23 kernel: [195995.004837] 000: kthread+0x133/0x180
Jan 2 00:14:23 kernel: [195995.004839] 000: ? process_one_work+0x470/0x470
Jan 2 00:14:23 kernel: [195995.004841] 000: ? kthread_park+0x90/0x90
Jan 2 00:14:23 kernel: [195995.004843] 000: ret_from_fork+0x35/0x40
After "task kworker/u56:2:6079 blocked for more than 737 seconds.",
the network seems down, both ssh, ifconfig can't work.
>> Sure, PID 1369493 addrconf_verify_work() is waiting for RTNL.
>>
>> But PID 8 ?
>>
>> __flush_work() is being called.
>>
>> But from where ? Stacks seem not complete.
__flush_work is reached from within the cleanup_net work item itself, so
yes, we don't know who queued that work, and we don't know whether this
netns cleanup is even expected. Maybe I need to add a trace at put_net
(see the sketch after the call chain below):
put_net
  __put_net -> queue_work(netns_wq, &net_cleanup_work);
               DECLARE_WORK(net_cleanup_work, cleanup_net);
cleanup_net
  ops_exit_list
    default_device_exit_batch
      unregister_netdevice_many
        rollback_registered_many
          flush_work
            __flush_work
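A minimal sketch of that trace, as a hypothetical debug-only kprobe module
on __put_net() (illustrative; not something we have actually run):

#include <linux/module.h>
#include <linux/kprobes.h>
#include <linux/sched.h>

/* Log every task that drops a netns reference via __put_net(), so we can
 * see which container teardown ends up queueing cleanup_net. */
static int put_net_pre(struct kprobe *p, struct pt_regs *regs)
{
	pr_info("__put_net from %s[%d]\n", current->comm, current->pid);
	return 0;
}

static struct kprobe put_net_kp = {
	.symbol_name = "__put_net",
	.pre_handler = put_net_pre,
};

static int __init trace_put_net_init(void)
{
	return register_kprobe(&put_net_kp);
}

static void __exit trace_put_net_exit(void)
{
	unregister_kprobe(&put_net_kp);
}

module_init(trace_put_net_init);
module_exit(trace_put_net_exit);
MODULE_LICENSE("GPL");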
>> But PID 1369493 is waiting on a mutex, thus properly yielding the cpu.
>> (schedule() is clearly shown)
>>
>> This should not prevent other threads
>> from making progress so that flush_all_backlogs() completes eventually.
>>
>> flush_all_backlogs() does not care of how many threads are currently blocked
>> because they can not grab rtnl while flush_all_backlogs() is running.
In my log, we can see that even though schedule() is shown, the task stays
in D state. It seems addrconf_verify_work is blocked by cleanup_net.
But why is cleanup_net itself blocked?

Currently, my plan is to add a trace/print in __flush_work to see which
work item it is blocked on, and then work out who owns that work (sketch
below).
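Something like this local-only debug patch to __flush_work() in
kernel/workqueue.c is what I have in mind (illustrative only; the real hunk
context is omitted, and the print would go after the local declarations):

--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ (illustrative hunk; placed inside __flush_work(), context elided)
+	/* debug only: name the work item this flush is about to wait on */
+	pr_info("%s[%d] flushing work %ps\n",
+		current->comm, current->pid, work->func);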
If you can give any advice, I would be very grateful.

BTW, this is an RT kernel; I'm also checking the system workload.
BR,
Dongyang