2021-09-12 13:11:01

by kernel test robot

Subject: [mctp] 831119f887: net/mctp/route.c:#RCU-list_traversed_in_non-reader_section



Greetings,

FYI, we noticed the following commit (built with gcc-9):

commit: 831119f8878173adbf31f1151adf0f4627c05e01 ("mctp: Add neighbour netlink interface")
https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master


in testcase: trinity
version: trinity-x86_64-da65f0aa-1_20210719
with the following parameters:

number: 99999
group: group-00

test-description: Trinity is a linux system call fuzz tester.
test-url: http://codemonkey.org.uk/projects/trinity/


on test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):


+------------------------------------------------------------+------------+------------+
|                                                            | 4d8b931928 | 831119f887 |
+------------------------------------------------------------+------------+------------+
| WARNING:at_kernel/locking/mutex.c:#__mutex_lock            | 33         |            |
| RIP:__mutex_lock                                           | 33         |            |
| net/mctp/route.c:#RCU-list_traversed_in_non-reader_section | 0          | 34         |
+------------------------------------------------------------+------------+------------+
The net/mctp/route.c:#RCU-list_traversed_in_non-reader_section report
above is shown in full at [1]. The dmesg is attached as dmesg.xz.

Please note that the __mutex_lock issue above is quite persistent on the
parent commit but does not appear on 831119f887; see [2]. The parent's
dmesg is attached as dmesg-parent.xz.


If you fix the issue, kindly add the following tag:
Reported-by: kernel test robot <[email protected]>


[1]
[ 1185.024190][ T4341]
[ 1185.045280][ T4341] =============================
[ 1185.065918][ T4341] WARNING: suspicious RCU usage
[ 1185.085966][ T4341] 5.14.0-rc2-00609-g831119f88781 #1 Not tainted
[ 1185.105870][ T4341] -----------------------------
[ 1185.125035][ T4341] net/mctp/route.c:539 RCU-list traversed in non-reader section!!
[ 1185.145449][ T4341]
[ 1185.145449][ T4341] other info that might help us debug this:
[ 1185.145449][ T4341]
[ 1185.203271][ T4341]
[ 1185.203271][ T4341] rcu_scheduler_active = 2, debug_locks = 1
[ 1185.238618][ T4341] 3 locks held by kworker/u4:0/4341:
[ 1185.256444][ T4341] #0: ffff97744d1a7148 ((wq_completion)netns){+.+.}-{0:0}, at: process_one_work+0x311/0x980
[ 1185.276938][ T4341] #1: ffffaa5bc2b1fe48 (net_cleanup_work){+.+.}-{0:0}, at: process_one_work+0x311/0x980
[ 1185.297723][ T4341] #2: ffffffff92abe518 (pernet_ops_rwsem){++++}-{3:3}, at: cleanup_net+0x4c/0x540
[ 1185.318190][ T4341]
[ 1185.318190][ T4341] stack backtrace:
[ 1185.353937][ T4341] CPU: 1 PID: 4341 Comm: kworker/u4:0 Not tainted 5.14.0-rc2-00609-g831119f88781 #1 da6e2240a2b7bfba76fb0db30bcbc709044affde
[ 1185.392818][ T4341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 1185.414149][ T4341] Workqueue: netns cleanup_net
[ 1185.434509][ T4341] Call Trace:
[ 1185.454281][ T4341] dump_stack_lvl+0xf9/0x169
[ 1185.474308][ T4341] mctp_routes_net_exit+0xb6/0xc0
[ 1185.495166][ T4341] ? mctp_route_release+0xc0/0xc0
[ 1185.518342][ T4341] ops_exit_list+0x51/0xc0
[ 1185.540222][ T4341] cleanup_net+0x317/0x540
[ 1185.560276][ T4341] process_one_work+0x3e7/0x980
[ 1185.579980][ T4341] worker_thread+0x5b/0x600
[ 1185.599574][ T4341] ? process_one_work+0x980/0x980
[ 1185.619318][ T4341] kthread+0x170/0x1c0
[ 1185.638173][ T4341] ? set_kthread_struct+0x80/0x80
[ 1185.657595][ T4341] ret_from_fork+0x22/0x30


[2]
[ 840.857965][ T269] ------------[ cut here ]------------
[ 840.873417][ T269] DEBUG_LOCKS_WARN_ON(lock->magic != lock)
[ 840.873558][ T269] WARNING: CPU: 1 PID: 269 at kernel/locking/mutex.c:941 __mutex_lock+0x8af/0x980
[ 840.904838][ T269] Modules linked in:
[ 840.919998][ T269] CPU: 1 PID: 269 Comm: kworker/u4:6 Not tainted 5.14.0-rc2-00608-g4d8b9319282a #1 f42656e0afb0e39d464fa0d5312671ab5011e5fa
[ 840.951471][ T269] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 840.968043][ T269] Workqueue: netns cleanup_net
[ 840.984063][ T269] RIP: 0010:__mutex_lock+0x8af/0x980
[ 841.000075][ T269] Code: 8b 8d 60 ff ff ff 48 8b 95 68 ff ff ff e9 be fe ff ff 90 0f 0b 90 48 c7 c6 ff 2e 70 b2 48 c7 c7 7d bc 6d b2 e8 c6 b4 82 ff 90 <0f> 0b 90 90 eb b0 90 48 c7 c6 13 2f 70 b2 48 c7 c7 7d bc 6d b2 e8
[ 841.034128][ T269] RSP: 0018:ffffbe748099fb50 EFLAGS: 00010282
[ 841.051128][ T269] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffffa798699f
[ 841.068286][ T269] RDX: 0000000000000000 RSI: ffff9bdddb289000 RDI: 0000000000000002
[ 841.085414][ T269] RBP: ffffbe748099fbf0 R08: 0000000000000000 R09: 0000000000000000
[ 841.102169][ T269] R10: 0000000000000000 R11: 000000002d2d2d2d R12: 0000000000000000
[ 841.118973][ T269] R13: ffff9bddf6b0f980 R14: ffff9bde782436f8 R15: 0000000000000001
[ 841.135800][ T269] FS: 0000000000000000(0000) GS:ffff9be0efa00000(0000) knlGS:0000000000000000
[ 841.152923][ T269] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 841.169905][ T269] CR2: 00007fd2ae5c14a8 CR3: 00000001b81b0000 CR4: 00000000000406e0
[ 841.187322][ T269] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 841.204963][ T269] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 841.222441][ T269] Call Trace:
[ 841.239255][ T269] ? mark_held_locks+0x50/0x80
[ 841.256099][ T269] ? mctp_neigh_remove_dev+0x3b/0x100
[ 841.273131][ T269] ? lockdep_hardirqs_on+0x77/0x100
[ 841.290042][ T269] ? __call_rcu+0x209/0x540
[ 841.306996][ T269] ? trace_hardirqs_on+0x63/0x300
[ 841.323871][ T269] ? mctp_neigh_remove_dev+0x3b/0x100
[ 841.340797][ T269] mctp_neigh_remove_dev+0x3b/0x100
[ 841.357494][ T269] mctp_dev_notify+0x6b/0x2c0
[ 841.373942][ T269] notifier_call_chain+0x6a/0x140
[ 841.390465][ T269] call_netdevice_notifiers_info+0x7d/0x100
[ 841.406880][ T269] unregister_netdevice_many+0x5a9/0xa80
[ 841.423341][ T269] default_device_exit_batch+0x1a7/0x200
[ 841.439721][ T269] ? autoremove_wake_function+0x80/0x80
[ 841.456079][ T269] ? unregister_netdev+0x40/0x40
[ 841.472355][ T269] ? __dev_change_net_namespace+0x6c0/0x6c0
[ 841.488706][ T269] ops_exit_list+0x7e/0xc0
[ 841.505058][ T269] cleanup_net+0x317/0x540
[ 841.521021][ T269] process_one_work+0x3e7/0x980
[ 841.536825][ T269] worker_thread+0x5b/0x600
[ 841.552164][ T269] ? process_one_work+0x980/0x980
[ 841.567138][ T269] kthread+0x170/0x1c0
[ 841.581814][ T269] ? set_kthread_struct+0x80/0x80
[ 841.596646][ T269] ret_from_fork+0x22/0x30
[ 841.611378][ T269] irq event stamp: 1124135
[ 841.625143][ T269] hardirqs last enabled at (1124135): [<ffffffffa77ea009>] __call_rcu+0x209/0x540
[ 841.639866][ T269] hardirqs last disabled at (1124134): [<ffffffffa77ea074>] __call_rcu+0x274/0x540
[ 841.654741][ T269] softirqs last enabled at (1124054): [<ffffffffae59ce55>] __neigh_ifdown+0xd5/0x180
[ 841.669453][ T269] softirqs last disabled at (1124052): [<ffffffffae59cdb6>] __neigh_ifdown+0x36/0x180
[ 841.683649][ T269] ---[ end trace bdf300e3a00a25a8 ]---



To reproduce:

# build kernel
cd linux
cp config-5.14.0-rc2-00609-g831119f88781 .config
make HOSTCC=gcc-9 CC=gcc-9 ARCH=x86_64 olddefconfig prepare modules_prepare bzImage

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp qemu -k <bzImage> job-script # job-script is attached in this email
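
A minimal sketch for checking whether a run hit the report in [1]: the
helper below reads a kernel log on stdin and prints each distinct
file:line that lockdep flagged as an RCU-list traversal outside a reader
section. The function name rcu_list_sites is hypothetical and not part of
lkp-tests, and the sketch assumes a grep that supports -o (GNU or BSD).

```shell
#!/bin/sh
# Hypothetical helper, not part of lkp-tests: read a kernel log on stdin
# and print each distinct "file:line" that lockdep reported as
# "RCU-list traversed in non-reader section".
rcu_list_sites() {
    # Keep only the "path.c:NNN <message>" part of each matching line,
    # drop the message, then de-duplicate the remaining file:line sites.
    grep -o '[^ ]*\.[ch]:[0-9]* RCU-list traversed in non-reader section' |
        cut -d' ' -f1 |
        sort -u
}

# Usage, assuming the attached dmesg.xz is in the current directory:
#   xz -dc dmesg.xz | rcu_list_sites
```

If the decompressed log contains the report shown in [1], the output
should include net/mctp/route.c:539.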



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (7.76 kB)
config-5.14.0-rc2-00609-g831119f88781 (277.94 kB)
job-script (4.53 kB)
dmesg.xz (58.09 kB)
trinity (7.34 kB)
dmesg-parent.xz (57.51 kB)