Greetings,

FYI, we noticed the following commit (built with gcc-9):

commit: dbbee9d5cd83f9d0a29639e260516907ceb2ac3d ("mm/page_alloc: convert per-cpu list protection to local_lock")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master

in testcase: rcuscale
version:
with the following parameters:

	runtime: 300s
	scale_type: rcu

on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):
+----------------------------------------------------------------------+------------+------------+
|                                                                      | 28f836b677 | dbbee9d5cd |
+----------------------------------------------------------------------+------------+------------+
| WARNING:possible_circular_locking_dependency_detected                | 0          | 19         |
+----------------------------------------------------------------------+------------+------------+
If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <lkp@intel.com>
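
For context, the commit named above replaces the bare local_irq_save()
protection of the per-cpu page lists with a local_lock. A minimal sketch of
the pattern, with names loosely following mm/page_alloc.c (illustrative, not
the exact hunk from dbbee9d5cd):

	#include <linux/local_lock.h>
	#include <linux/percpu-defs.h>

	struct pagesets {
		local_lock_t lock;
	};
	static DEFINE_PER_CPU(struct pagesets, pagesets) = {
		.lock = INIT_LOCAL_LOCK(lock),
	};

	static void pcp_list_op_sketch(void)
	{
		unsigned long flags;

		/*
		 * On !PREEMPT_RT this still just disables interrupts, but
		 * it hands lockdep a real lock to track -- "lock#2" in the
		 * splat below is this per-cpu lock.
		 */
		local_lock_irqsave(&pagesets.lock, flags);
		/* ... operate on this CPU's pcp free lists ... */
		local_unlock_irqrestore(&pagesets.lock, flags);
	}
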
[ 35.813536]
[ 35.822731] ======================================================
[ 35.839288] WARNING: possible circular locking dependency detected
[ 35.849787] 5.13.0+ #1 Tainted: G W
[ 35.857793] ------------------------------------------------------
[ 35.861768] (sd-executor)/287 is trying to acquire lock:
[ 35.865616] ffff88839f9ef980 (lock#2){..-.}-{2:2}, at: rmqueue+0x6c4/0x2d10
[ 35.869948]
[ 35.869948] but task is already holding lock:
[ 35.876789] ffff888137841c30 (&anon_vma->rwsem){+.+.}-{3:3}, at: anon_vma_clone+0x11d/0x3f0
[ 35.881384]
[ 35.881384] which lock already depends on the new lock.
[ 35.881384]
[ 35.892102]
[ 35.892102] the existing dependency chain (in reverse order) is:
[ 35.899548]
[ 35.899548] -> #3 (&anon_vma->rwsem){+.+.}-{3:3}:
[ 35.907351] lock_acquire+0x1b7/0x510
[ 35.911221] down_write+0x87/0x380
[ 35.915154] __vma_adjust+0x595/0x15b0
[ 35.919148] __split_vma+0x35d/0x440
[ 35.923201] mprotect_fixup+0x533/0x6b0
[ 35.927158] __x64_sys_mprotect+0x351/0x650
[ 35.931339] do_syscall_64+0x5b/0x70
[ 35.935466] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 35.939399]
[ 35.939399] -> #2 (&mapping->i_mmap_rwsem){+.+.}-{3:3}:
[ 35.946885] lock_acquire+0x1b7/0x510
[ 35.951057] down_write+0x87/0x380
[ 35.955103] dma_resv_lockdep+0x2c0/0x444
[ 35.958970] do_one_initcall+0xf4/0x510
[ 35.962892] kernel_init_freeable+0x5cc/0x63a
[ 35.966862] kernel_init+0x8/0x103
[ 35.970642] ret_from_fork+0x1f/0x30
[ 35.974408]
[ 35.974408] -> #1 (fs_reclaim){+.+.}-{0:0}:
[ 35.981075] lock_acquire+0x1b7/0x510
[ 35.984700] fs_reclaim_acquire+0x117/0x160
[ 35.988429] __alloc_pages+0x137/0x650
[ 35.991994] stack_depot_save+0x390/0x4c0
[ 35.995374] save_stack+0x107/0x170
[ 35.998858] __set_page_owner+0x37/0x280
[ 36.002331] __alloc_pages_bulk+0xa34/0x1360
[ 36.005764] __vmalloc_area_node+0x180/0x600
[ 36.009644] __vmalloc_node_range+0xac/0x150
[ 36.013167] __vmalloc+0x63/0x90
[ 36.016439] pcpu_create_chunk+0x110/0x7e0
[ 36.019953] __pcpu_balance_workfn+0xcc5/0xff0
[ 36.023344] process_one_work+0x831/0x13f0
[ 36.026729] worker_thread+0x8c/0xca0
[ 36.030104] kthread+0x324/0x430
[ 36.033196] ret_from_fork+0x1f/0x30
[ 36.036355]
[ 36.036355] -> #0 (lock#2){..-.}-{2:2}:
[ 36.041975] check_prev_add+0x162/0x2500
[ 36.045033] __lock_acquire+0x24e9/0x3680
[ 36.048146] lock_acquire+0x1b7/0x510
[ 36.051324] rmqueue+0x6f9/0x2d10
[ 36.054141] get_page_from_freelist+0x165/0x970
[ 36.057131] __alloc_pages+0x251/0x650
[ 36.060107] allocate_slab+0x26d/0x320
[ 36.063169] ___slab_alloc+0x202/0x640
[ 36.066388] __slab_alloc+0x60/0x80
[ 36.069474] kmem_cache_alloc+0x677/0x730
[ 36.072417] anon_vma_clone+0xc6/0x3f0
[ 36.075370] anon_vma_fork+0x78/0x420
[ 36.078164] dup_mmap+0x707/0xb30
[ 36.080887] dup_mm+0x57/0x340
[ 36.083656] copy_process+0x3ca3/0x54d0
[ 36.086569] kernel_clone+0xa4/0xc00
[ 36.089250] __do_sys_clone+0x9e/0xc0
[ 36.091955] do_syscall_64+0x5b/0x70
[ 36.094761] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 36.097842]
[ 36.097842] other info that might help us debug this:
[ 36.097842]
[ 36.105726] Chain exists of:
[ 36.105726] lock#2 --> &mapping->i_mmap_rwsem --> &anon_vma->rwsem
[ 36.105726]
[ 36.114140] Possible unsafe locking scenario:
[ 36.114140]
[ 36.119328]        CPU0                    CPU1
[ 36.122223]        ----                    ----
[ 36.125212]   lock(&anon_vma->rwsem);
[ 36.128035]                                lock(&mapping->i_mmap_rwsem);
[ 36.131241]                                lock(&anon_vma->rwsem);
[ 36.134520]   lock(lock#2);
[ 36.137155]
[ 36.137155] *** DEADLOCK ***
[ 36.137155]
[ 36.144603] 3 locks held by (sd-executor)/287:
[ 36.147545] #0: ffff8881001bc028 (&mm->mmap_lock#2){++++}-{3:3}, at: dup_mmap+0x94/0xb30
[ 36.151198] #1: ffff888133ba2b28 (&mm->mmap_lock/1){+.+.}-{3:3}, at: dup_mmap+0xc1/0xb30
[ 36.154990] #2: ffff888137841c30 (&anon_vma->rwsem){+.+.}-{3:3}, at: anon_vma_clone+0x11d/0x3f0
[ 36.158706]
[ 36.158706] stack backtrace:
[ 36.163710] CPU: 1 PID: 287 Comm: (sd-executor) Tainted: G W 5.13.0+ #1
[ 36.167347] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 36.171201] Call Trace:
[ 36.174212] dump_stack_lvl+0xa2/0xe5
[ 36.177324] check_noncircular+0x23f/0x2d0
[ 36.180407] ? print_circular_bug+0x470/0x470
[ 36.183782] ? __thaw_task+0x70/0x70
[ 36.186717] ? mark_lock+0xc3/0x1460
[ 36.189813] ? deref_stack_reg+0x33/0x70
[ 36.192954] check_prev_add+0x162/0x2500
[ 36.196092] ? kvm_sched_clock_read+0x14/0x30
[ 36.199143] ? sched_clock_cpu+0x18/0x190
[ 36.202241] ? unwind_next_frame+0x39d/0x1760
[ 36.205460] __lock_acquire+0x24e9/0x3680
[ 36.208635] ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
[ 36.211899] lock_acquire+0x1b7/0x510
[ 36.215004] ? rmqueue+0x6c4/0x2d10
[ 36.218321] ? stack_trace_snprint+0xd0/0xd0
[ 36.221563] ? rcu_read_unlock+0x40/0x40
[ 36.224737] ? deref_stack_reg+0x33/0x70
[ 36.228072] ? unwind_next_frame+0x10d7/0x1760
[ 36.231196] ? get_reg+0xef/0x170
[ 36.234288] rmqueue+0x6f9/0x2d10
[ 36.237434] ? rmqueue+0x6c4/0x2d10
[ 36.240441] ? deref_stack_reg+0x70/0x70
[ 36.243726] ? is_module_text_address+0xc/0x20
[ 36.247010] ? __kernel_text_address+0x9/0x30
[ 36.250044] ? unwind_get_return_address+0x5a/0xa0
[ 36.253422] ? __thaw_task+0x70/0x70
[ 36.256644] ? arch_stack_walk+0x71/0xb0
[ 36.259823] ? rmqueue_bulk+0x1e90/0x1e90
[ 36.263072] get_page_from_freelist+0x165/0x970
[ 36.266278] __alloc_pages+0x251/0x650
[ 36.269501] ? __alloc_pages_slowpath+0x2020/0x2020
[ 36.273045] allocate_slab+0x26d/0x320
[ 36.276355] ___slab_alloc+0x202/0x640
[ 36.279786] ? find_held_lock+0x33/0x110
[ 36.282919] ? anon_vma_clone+0xc6/0x3f0
[ 36.286173] ? lock_contended+0xbf0/0xbf0
[ 36.289163] ? anon_vma_clone+0xc6/0x3f0
[ 36.292356] __slab_alloc+0x60/0x80
[ 36.295876] ? anon_vma_clone+0xc6/0x3f0
[ 36.299164] kmem_cache_alloc+0x677/0x730
[ 36.302414] ? anon_vma_chain_link+0x8c/0x180
[ 36.305545] anon_vma_clone+0xc6/0x3f0
[ 36.308617] anon_vma_fork+0x78/0x420
[ 36.311618] dup_mmap+0x707/0xb30
[ 36.314743] ? vm_area_dup+0x2a0/0x2a0
[ 36.317945] dup_mm+0x57/0x340
[ 36.320830] copy_process+0x3ca3/0x54d0
[ 36.323948] ? __cleanup_sighand+0x60/0x60
[ 36.326864] kernel_clone+0xa4/0xc00
[ 36.329764] ? create_io_thread+0xb0/0xb0
[ 36.332847] ? rcu_read_lock_sched_held+0x7c/0xb0
[ 36.336123] ? rcu_read_lock_bh_held+0x90/0x90
[ 36.339079] ? fpregs_assert_state_consistent+0x18/0x90
[ 36.342318] ? lockdep_hardirqs_on_prepare+0x273/0x3e0
[ 36.345381] __do_sys_clone+0x9e/0xc0
[ 36.348295] ? kernel_clone+0xc00/0xc00
[ 36.351371] ? syscall_enter_from_user_mode+0x18/0x50
[ 36.356448] ? syscall_enter_from_user_mode+0x1d/0x50
[ 36.359499] do_syscall_64+0x5b/0x70
[ 36.362474] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 36.365573] RIP: 0033:0x7f19a2e537be
[ 36.368416] Code: db 0f 85 25 01 00 00 64 4c 8b 0c 25 10 00 00 00 45 31 c0 4d 8d 91 d0 02 00 00 31 d2 31 f6 bf 11 00 20 01 b8 38 00 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 b6 00 00 00 41 89 c4 85 c0 0f 85 c3 00 00
[ 36.376479] RSP: 002b:00007ffcf4722e60 EFLAGS: 00000246 ORIG_RAX: 0000000000000038
[ 36.380491] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f19a2e537be
[ 36.384372] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000001200011
[ 36.388173] RBP: 0000000000000000 R08: 0000000000000000 R09: 00007f19a1cc6940
[ 36.391976] R10: 00007f19a1cc6c10 R11: 0000000000000246 R12: 0000000000000020
[ 36.395860] R13: 00007f19a2cbc41c R14: 00007ffcf4723028 R15: 0000000000000001
[ 36.651662] (sd-executor) (287) used greatest stack depth: 26488 bytes left
[ 36.771729] random: systemd: uninitialized urandom read (16 bytes read)
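
Reading the splat: the #1 edge was recorded when __alloc_pages_bulk() held the
new per-cpu lock ("lock#2") while the page owner saved an allocation stack and
stack_depot_save() re-entered the page allocator, taking fs_reclaim; earlier
chains already connect fs_reclaim to i_mmap_rwsem and on to anon_vma->rwsem.
The #0 edge closes the cycle: lock#2 taken in rmqueue() while anon_vma->rwsem
is held during fork(). A condensed sketch of the #1 recursion, reusing the
pagesets sketch above (hypothetical helper name; the real code sits in
__alloc_pages_bulk()/__set_page_owner()):

	#include <linux/gfp.h>
	#include <linux/stackdepot.h>
	#include <linux/stacktrace.h>

	static void bulk_alloc_page_owner_sketch(void)
	{
		unsigned long entries[16];
		unsigned long flags;
		unsigned int nr;

		local_lock_irqsave(&pagesets.lock, flags);	/* lock#2 */

		/* page owner records the stack of each new allocation ... */
		nr = stack_trace_save(entries, ARRAY_SIZE(entries), 0);
		/*
		 * ... and stack_depot_save() may itself allocate, re-entering
		 * the page allocator (fs_reclaim -> ... -> anon_vma->rwsem)
		 * with lock#2 still held: exactly the dependency lockdep
		 * complains about.
		 */
		stack_depot_save(entries, nr, GFP_KERNEL);

		local_unlock_irqrestore(&pagesets.lock, flags);
	}
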
To reproduce:

	# build the kernel
	cd linux
	cp config-5.13.0+ .config
	make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage

	# fetch lkp-tests and boot the resulting bzImage under qemu with the attached job
	git clone https://github.com/intel/lkp-tests.git
	cd lkp-tests
	bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
---
0DAY/LKP+ Test Infrastructure                   Open Source Technology Center
https://lists.01.org/hyperkitty/list/lkp@lists.01.org       Intel Corporation
Thanks,
Oliver Sang
On Tue, Jul 13, 2021 at 10:40:57PM +0800, kernel test robot wrote:
>
>
> Greetings,
>
> FYI, we noticed the following commit (built with gcc-9):
>
> commit: dbbee9d5cd83f9d0a29639e260516907ceb2ac3d ("mm/page_alloc: convert per-cpu list protection to local_lock")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
Thanks, this will ultimately be fixed by https://lore.kernel.org/r/[email protected]
--
Mel Gorman
SUSE Labs
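
For the archive: as far as I can tell, the fix referenced above landed
upstream as "mm/page_alloc: avoid page allocator recursion with pagesets.lock
held". It side-steps the recursion by making __alloc_pages_bulk() fall back to
the single-page allocator whenever page_owner is active, so the stack depot
never allocates under the local_lock. A hedged reconstruction of that guard,
placed in __alloc_pages_bulk() before pagesets.lock is taken (a sketch, not
the verbatim patch):

	#ifdef CONFIG_PAGE_OWNER
		/*
		 * PAGE_OWNER may recurse into the allocator to save the
		 * stack while pagesets.lock is held. Rather than drop and
		 * reacquire the lock around the recursion, send such
		 * callers down the one-page-at-a-time path, which records
		 * the stack without the local_lock held.
		 */
		if (static_branch_unlikely(&page_owner_inited))
			goto failed;
	#endif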