2021-01-26 10:55:42

by Xing Zhengjun

[permalink] [raw]
Subject: Re: [workqueue] d5bff968ea: WARNING:at_kernel/workqueue.c:#process_one_work



On 1/25/2021 5:29 PM, Hillf Danton wrote:
> On 25 Jan 2021 16:31:32 +0800 Xing Zhengjun wrote:
>> On 1/22/2021 3:59 PM, Hillf Danton wrote:
>>> On Fri, 22 Jan 2021 09:48:32 +0800 Xing Zhengjun wrote:
>>>> On 1/21/2021 12:00 PM, Hillf Danton wrote:
>>>>> On Wed, 20 Jan 2021 21:46:33 +0800 Oliver Sang wrote:
>>>>>> On Fri, Jan 15, 2021 at 03:24:32PM +0800, Hillf Danton wrote:
>>>>>>> Thu, 14 Jan 2021 15:45:11 +0800
>>>>>>>>
>>>>>>>> FYI, we noticed the following commit (built with gcc-9):
>>>>>>>>
>>>>>>>> commit: d5bff968ea9cc005e632d9369c26cbd8148c93d5 ("workqueue: break affinity initiatively")
>>>>>>>> https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git dev.2021.01.11b
>>>>>>>>
>>>>>>> [...]
>>>>>>>>
>>>>>>>> [ 73.794288] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2192 process_one_work
>>>>>>>
>>>>>>> Thanks for your report.
>>>>>>>
>>>>>>> We can also break CPU affinity by checking POOL_DISASSOCIATED at attach
>>>>>>> time without extra cost paid; that way we have the same behavior as at
>>>>>>> the unbind time.
>>>>>>>
>>>>>>> What is more the change that makes kworker pcpu is cut because they are
>>>>>>> going to not help either hotplug or the mechanism of stop machine.
>>>>>>
>>>>>> hi, by applying below patch, the issue still happened.
>>>>>
>>>>> Thanks for your report.
>>>>>>
>>>>>> [ 4.574467] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
>>>>>> [ 4.575651] pci 0000:00:01.0: Activating ISA DMA hang workarounds
>>>>>> [ 4.576900] pci 0000:00:02.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff]
>>>>>> [ 4.578648] PCI: CLS 0 bytes, default 64
>>>>>> [ 4.579685] Unpacking initramfs...
>>>>>> [ 8.878031] -----------[ cut here ]-----------
>>>>>> [ 8.879083] WARNING: CPU: 0 PID: 22 at kernel/workqueue.c:2187 process_one_work+0x92/0x9e0
>>>>>> [ 8.880688] Modules linked in:
>>>>>> [ 8.881274] CPU: 0 PID: 22 Comm: kworker/1:0 Not tainted 5.11.0-rc3-gc213503139bb #2
>>>>>
>>>>> The kworker bond to CPU1 runs on CPU0 and triggers the warning, which
>>>>> shows that scheduler breaks CPU affinity, after 06249738a41a
>>>>> ("workqueue: Manually break affinity on hotplug"), though quite likely
>>>>> by kworker/1:0 for the initial workers.
>>>>>
>>>>>> [ 8.882518] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>>>>>> [ 8.887539] Workqueue: 0x0 (events)
>>>>>> [ 8.887838] EIP: process_one_work+0x92/0x9e0
>>>>>> [ 8.887838] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 74 85 13 00 ff 05 b8 30 04 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>>>>>> [ 8.887838] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>>>>>> [ 8.887838] ESI: 43c04720 EDI: 42e45620 EBP: de7f23c0 ESP: 43d7bf08
>>>>>> [ 8.887838] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>>>>>> [ 8.887838] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
>>>>>> [ 8.887838] Call Trace:
>>>>>> [ 8.887838] ? worker_thread+0x98/0x6a0
>>>>>> [ 8.887838] ? worker_thread+0x2dd/0x6a0
>>>>>> [ 8.887838] ? kthread+0x1ba/0x1e0
>>>>>> [ 8.887838] ? create_worker+0x1e0/0x1e0
>>>>>> [ 8.887838] ? kzalloc+0x20/0x20
>>>>>> [ 8.887838] ? ret_from_fork+0x1c/0x28
>>>>>> [ 8.887838] _warn_unseeded_randomness: 63 callbacks suppressed
>>>>>> [ 8.887838] random: get_random_bytes called from init_oops_id+0x2b/0x60 with crng_init=0
>>>>>> [ 8.887838] --[ end trace ac461b4d54c37cfa ]--
>>>>>
>>>>>
>>>>> Instead of creating the initial workers only on the active CPUS, rebind
>>>>> them (labeled pcpu) and jump to the right CPU at bootup time.
>>>>>
>>>>> --- a/kernel/workqueue.c
>>>>> +++ b/kernel/workqueue.c
>>>>> @@ -2385,6 +2385,16 @@ woke_up:
>>>>> return 0;
>>>>> }
>>>>>
>>>>> + if (!(pool->flags & POOL_DISASSOCIATED) && smp_processor_id() !=
>>>>> + pool->cpu) {
>>>>> + /* scheduler breaks CPU affinity for us, rebind it */
>>>>> + raw_spin_unlock_irq(&pool->lock);
>>>>> + set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>>>> + /* and jump to the right seat */
>>>>> + schedule_timeout_interruptible(1);
>>>>> + goto woke_up;
>>>>> + }
>>>>> +
>>>>> worker_leave_idle(worker);
>>>>> recheck:
>>>>> /* no more worker necessary? */
>>>>> --
>>>>>
>>>> I test the patch, the warning still appears in the kernel log.
>>>
>>> Thanks for your report.
>>>>
>>>> [ 230.356503] smpboot: CPU 1 is now offline
>>>> [ 230.544652] x86: Booting SMP configuration:
>>>> [ 230.545077] smpboot: Booting Node 0 Processor 1 APIC 0x1
>>>> [ 230.545640] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>>>> [ 230.545675] masked ExtINT on CPU#1
>>>> [ 230.593829] ------------[ cut here ]------------
>>>> [ 230.594257] WARNING: CPU: 0 PID: 257 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0
>>>> [ 230.594990] Modules linked in: rcutorture torture mousedev input_leds
>>>> led_class pcspkr psmouse evbug tiny_power_button button
>>>> [ 230.595961] CPU: 0 PID: 257 Comm: kworker/1:3 Not tainted 5.11.0-rc3-gdcba55d9080f #2
>>>
>>> Like what was reported, kworker bond to CPU1 runs on CPU0 and triggers
>>> warning, due to scheduler breaking CPU affinity for us. What is new, the
>>> affinity was broken at offline time instead of bootup.
>>>
>>>> [ 230.596621] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>>>> [ 230.597322] Workqueue: 0x0 (rcu_gp)
>>>> [ 230.597636] EIP: process_one_work+0x92/0x9e0
>>>> [ 230.598005] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00
>>>> 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 f4 85 13 00 ff 05 cc 30 04
>>>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>>>> [ 230.599569] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>>>> [ 230.600100] ESI: 43d94240 EDI: df4040f4 EBP: de7f23c0 ESP: bf5f1f08
>>>> [ 230.600629] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>>>> [ 230.601203] CR0: 80050033 CR2: 01bdecbc CR3: 04e2c000 CR4: 000406d0
>>>> [ 230.601735] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>>>> [ 230.602265] DR6: fffe0ff0 DR7: 00000400
>>>> [ 230.602594] Call Trace:
>>>> [ 230.602813] ? process_one_work+0x20e/0x9e0
>>>> [ 230.603181] ? worker_thread+0x32d/0x700
>>>> [ 230.603522] ? kthread+0x1ba/0x1e0
>>>> [ 230.603818] ? create_worker+0x1e0/0x1e0
>>>> [ 230.604157] ? kzalloc+0x20/0x20
>>>> [ 230.604524] ? ret_from_fork+0x1c/0x28
>>>> [ 230.604850] ---[ end trace 06b1e66b5e17fa85 ]---
>>>> [ 230.605504] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>>>> [ 230.766960] smpboot: CPU 1 is now offline
>>>> [ 230.814803] x86: Booting SMP configuration:
>>>> [ 230.815306] smpboot: Booting Node 0 Processor 1 APIC 0x1
>>>> [ 230.815964] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>>>
>>>
>>> Unlike the above diff that is at most papering over the problem
>>> sitting somewhere in the scheduler, add change to creating worker
>>> by skipping set_cpus_allowed_ptr() because we will wake it up after
>>> attaching it to worker pool.
>>>
>>> If we can ignore rescuer for now, then the allowed ptr is only
>>> updated at on/offline time; lets see the difference at boot time.
>>>
>>>
>>> --- a/kernel/workqueue.c
>>> +++ b/kernel/workqueue.c
>>> @@ -1844,16 +1844,10 @@ static struct worker *alloc_worker(int n
>>> * cpu-[un]hotplugs.
>>> */
>>> static void worker_attach_to_pool(struct worker *worker,
>>> - struct worker_pool *pool)
>>> + struct worker_pool *pool,
>>> + int update_cpus_allowed)
>>> {
>>> mutex_lock(&wq_pool_attach_mutex);
>>> -
>>> - /*
>>> - * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
>>> - * online CPUs. It'll be re-applied when any of the CPUs come up.
>>> - */
>>> - set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>> -
>>> /*
>>> * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
>>> * stable across this function. See the comments above the flag
>>> @@ -1867,6 +1861,9 @@ static void worker_attach_to_pool(struct
>>> list_add_tail(&worker->node, &pool->workers);
>>> worker->pool = pool;
>>>
>>> + if (update_cpus_allowed)
>>> + set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
>>> +
>>> mutex_unlock(&wq_pool_attach_mutex);
>>> }
>>>
>>> @@ -1942,8 +1939,11 @@ static struct worker *create_worker(stru
>>> set_user_nice(worker->task, pool->attrs->nice);
>>> kthread_bind_mask(worker->task, pool->attrs->cpumask);
>>>
>>> - /* successful, attach the worker to the pool */
>>> - worker_attach_to_pool(worker, pool);
>>> + /*
>>> + * attach the worker to the pool without asking scheduler to
>>> + * update CPUs allowed
>>> + */
>>> + worker_attach_to_pool(worker, pool, 0);
>>>
>>> /* start the newly created worker */
>>> raw_spin_lock_irq(&pool->lock);
>>> @@ -2508,7 +2508,7 @@ repeat:
>>>
>>> raw_spin_unlock_irq(&wq_mayday_lock);
>>>
>>> - worker_attach_to_pool(rescuer, pool);
>>> + worker_attach_to_pool(rescuer, pool, 1);
>>>
>>> raw_spin_lock_irq(&pool->lock);
>>>
>>> --
>>>
>> I test the patch, the warning still appears in the kernel log.
>
> Thanks.
>>
>> [ 55.754187] smpboot: Booting Node 0 Processor 1 APIC 0x1
>> [ 55.785594] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
>> [ 55.785646] masked ExtINT on CPU#1
>> [ 55.920602] ------------[ cut here ]------------
>> [ 55.921355] WARNING: CPU: 0 PID: 160 at kernel/workqueue.c:2192 process_one_work+0x92/0x9e0
>> [ 55.922583] Modules linked in: rcutorture torture mousedev evbug
>> input_leds led_class tiny_power_button psmouse pcspkr button
>> [ 55.924294] CPU: 0 PID: 160 Comm: kworker/1:2 Not tainted 5.11.0-rc3-00186-g77bf4e461cfa #2
>
> Same issue as before.
>
>> [ 55.925552] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
>> [ 55.926763] Workqueue: 0x0 (rcu_gp)
>> [ 55.927298] EIP: process_one_work+0x92/0x9e0
>> [ 55.927950] Code: 37 64 a1 58 54 4c 43 39 45 24 74 2c 31 c9 ba 01 00
>> 00 00 c7 04 24 01 00 00 00 b8 08 1d f5 42 e8 94 85 13 00 ff 05 b8 30 04
>> 43 <0f> 0b ba 01 00 00 00 eb 22 8d 74 26 00 90 c7 04 24 01 00 00 00 31
>> [ 55.930726] EAX: 42f51d08 EBX: 00000000 ECX: 00000000 EDX: 00000001
>> [ 55.931642] ESI: 43d90540 EDI: df48c0f4 EBP: de7f23c0 ESP: bfb47f08
>> [ 55.932590] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010002
>> [ 55.933609] CR0: 80050033 CR2: 024e994c CR3: 7fd80000 CR4: 000406d0
>> [ 55.934555] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
>> [ 55.935457] DR6: fffe0ff0 DR7: 00000400
>> [ 55.936041] Call Trace:
>> [ 55.936534] ? process_one_work+0x20e/0x9e0
>> [ 55.937305] ? worker_thread+0x2dd/0x6a0
>> [ 55.938018] ? kthread+0x1ba/0x1e0
>> [ 55.938598] ? create_worker+0x1e0/0x1e0
>> [ 55.939315] ? kzalloc+0x20/0x20
>> [ 55.940000] ? ret_from_fork+0x1c/0x28
>> [ 55.940627] ---[ end trace d155e9e6402de179 ]---
>> [ 55.941641] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
>> [ 56.155271] smpboot: CPU 1 is now offline
>> [ 56.193613] x86: Booting SMP configuration:
>> [ 56.194400] smpboot: Booting Node 0 Processor 1 APIC 0x1
>
> The changes in the diff below are
>
> 1/ at rescue time, change CPU affinity only if POOL_DISASSOCIATED
> is not set, and print warning the same way as offline time.
>
> 2/ at offine time, dont update allowed CPUs after setting
> POOL_DISASSOCIATED because we no longer have interest in affinity.
>
> 3/ at online time, mark pcpu before binding affinity.
>
> Though one change a diff is appreciated, by the WARNs, we can tell
> which is what if any warning goes into dmesg.
>
> --- a/kernel/workqueue.c
> +++ b/kernel/workqueue.c
> @@ -1844,25 +1844,23 @@ static struct worker *alloc_worker(int n
> * cpu-[un]hotplugs.
> */
> static void worker_attach_to_pool(struct worker *worker,
> - struct worker_pool *pool)
> + struct worker_pool *pool, int set)
> {
> mutex_lock(&wq_pool_attach_mutex);
>
> /*
> - * set_cpus_allowed_ptr() will fail if the cpumask doesn't have any
> - * online CPUs. It'll be re-applied when any of the CPUs come up.
> - */
> - set_cpus_allowed_ptr(worker->task, pool->attrs->cpumask);
> -
> - /*
> * The wq_pool_attach_mutex ensures %POOL_DISASSOCIATED remains
> * stable across this function. See the comments above the flag
> * definition for details.
> */
> if (pool->flags & POOL_DISASSOCIATED)
> worker->flags |= WORKER_UNBOUND;
> - else
> + else {
> kthread_set_per_cpu(worker->task, true);
> + if (set)
> + WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
> + pool->attrs->cpumask) < 0);
> + }
>
> list_add_tail(&worker->node, &pool->workers);
> worker->pool = pool;
> @@ -1943,7 +1941,7 @@ static struct worker *create_worker(stru
> kthread_bind_mask(worker->task, pool->attrs->cpumask);
>
> /* successful, attach the worker to the pool */
> - worker_attach_to_pool(worker, pool);
> + worker_attach_to_pool(worker, pool, 0);
>
> /* start the newly created worker */
> raw_spin_lock_irq(&pool->lock);
> @@ -2508,7 +2506,7 @@ repeat:
>
> raw_spin_unlock_irq(&wq_mayday_lock);
>
> - worker_attach_to_pool(rescuer, pool);
> + worker_attach_to_pool(rescuer, pool, 1);
>
> raw_spin_lock_irq(&pool->lock);
>
> @@ -4923,7 +4921,6 @@ static void unbind_workers(int cpu)
>
> for_each_pool_worker(worker, pool) {
> kthread_set_per_cpu(worker->task, false);
> - WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, cpu_possible_mask) < 0);
> }
>
> mutex_unlock(&wq_pool_attach_mutex);
> @@ -4977,9 +4974,9 @@ static void rebind_workers(struct worker
> * from CPU_ONLINE, the following shouldn't fail.
> */
> for_each_pool_worker(worker, pool) {
> + kthread_set_per_cpu(worker->task, true);
> WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
> pool->attrs->cpumask) < 0);
> - kthread_set_per_cpu(worker->task, true);
> }
>
> raw_spin_lock_irq(&pool->lock);
> --
>
I test the patch, the warning still appears in the kernel log, but the
warning is different from before.

[ 0.054803] kvm-clock: cpu 1, msr 34f6021, secondary cpu clock
[ 0.054803] masked ExtINT on CPU#1
[ 0.054803] smpboot: CPU 1 Converting physical 0 to logical die 1
[ 1.890338] ------------[ cut here ]------------
[ 1.890338] WARNING: CPU: 1 PID: 18 at kernel/kthread.c:508
kthread_set_per_cpu+0x156/0x180
[ 1.890338] Modules linked in:
[ 1.890338] CPU: 1 PID: 18 Comm: cpuhp/1 Not tainted
5.11.0-rc3-00186-ged03082352b2 #2
[ 1.890338] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS 1.12.0-1 04/01/2014
[ 1.890338] EIP: kthread_set_per_cpu+0x156/0x180
[ 1.890338] Code: 00 00 00 00 ff 05 68 4e 04 43 83 c4 08 5b 5e 5f c3
8d 76 00 ff 05 34 50 04 43 0f 0b e9 f9 fe ff ff 8d 76 00 ff 05 2c 4e 04
43 <0f> 0b eb 9d 8d b6 00 00 00 00 ff 05 40 4e 04 43 0f 0b e9 45 ff ff
[ 1.890338] EAX: 42f52ce0 EBX: 00000001 ECX: 00000000 EDX: 00000001
[ 1.890338] ESI: 43d76300 EDI: 43c0de00 EBP: de7f2564 ESP: 43d6beb8
[ 1.900350] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010202
[ 1.901303] CR0: 80050033 CR2: 00000000 CR3: 034e3000 CR4: 000406d0
[ 1.902280] Call Trace:
[ 1.902682] ? workqueue_online_cpu+0x12b/0x640
[ 1.903415] ? workqueue_prepare_cpu+0xa0/0xa0
[ 1.904155] ? cpuhp_invoke_callback+0x1ed/0x1340
[ 1.904941] ? cpuhp_thread_fun+0x28f/0x460
[ 1.905630] ? cpuhp_thread_fun+0x49/0x460
[ 1.906298] ? smpboot_thread_fn+0x446/0x620
[ 1.910275] ? kthread+0x1ba/0x1e0
[ 1.910857] ? __smpboot_create_thread+0x260/0x260
[ 1.911659] ? kzalloc+0x20/0x20
[ 1.912368] ? ret_from_fork+0x1c/0x28
[ 1.913016] ---[ end trace 6f6c005278241eba ]---
[ 1.913971] kvm-guest: stealtime: cpu 1, msr 9e7e6ec0
[ 1.920012] smp: Brought up 1 node, 2 CPUs
[ 1.920299] smpboot: Max logical packages: 2
[ 1.921019] smpboot: Total of 2 processors activated (10774.03 BogoMIPS)



--
Zhengjun Xing


Attachments:
dmesg.xz (37.69 kB)