I've been fighting this bug for a few days now and I'm running out of
ideas, so I put together a small test program for the netfilter
experts here to try. Attached is a C program that triggers the BUG_ON.
I have narrowed the cause down to the part of my code that sends
NFT_MSG_NEWRULE: if you comment that out, the bug does not happen.
Let me know if you need more information or have a patch for me to
try.

The kernel config is nothing special: a minimal x86 build for qemu
with ipv{4,6} and the full set of nftables options, no modules.

------------[ cut here ]------------
kernel BUG at net/netfilter/nf_tables_api.c:816!
invalid opcode: 0000 [#1]
CPU: 0 PID: 42 Comm: kworker/u2:2 Not tainted 4.9.40 #1
Workqueue: netns cleanup_net
task: c0225540 task.stack: c026e000
EIP: 0060:[<c1289440>] EFLAGS: 00000202 CPU: 0
EIP is at nf_tables_table_destroy.isra.23.part.24+0x0/0x10
EAX: f4e613d8 EBX: f4e613d8 ECX: 000000a8 EDX: 00000001
ESI: f4e613d8 EDI: f4e613d8 EBP: f4e613e8 ESP: c026fe90
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
CR0: 80050033 CR2: 08197148 CR3: 002a5000 CR4: 00000690
Stack:
c128caa6 f4e613e8 f4e4a000 f4e52464 f4e52450 f4e52464 f4e4a000 f4e52450
f4e613d8 f4e61664 00000000 00000000 00000000 00000000 f4e4a000 c026ff08
c1470348 c147034c c133f31e f4e4a000 c124e506 c147033c c026fee8 c026ff10
Call Trace:
[<c128caa6>] ? nft_unregister_afinfo+0x1f6/0x200
[<c133f31e>] ? nf_tables_ipv6_exit_net+0xe/0x20
[<c124e506>] ? ops_exit_list.isra.6+0x26/0x50
[<c124edc5>] ? cleanup_net+0x135/0x210
[<c1044c1a>] ? pick_next_task_fair+0xba/0x120
[<c1038c7e>] ? process_one_work+0x19e/0x350
[<c1038e77>] ? worker_thread+0x47/0x4a0
[<c1038e30>] ? process_one_work+0x350/0x350
[<c103d248>] ? kthread+0x98/0xb0
[<c103d1b0>] ? kthread_worker_fn+0xb0/0xb0
[<c134c377>] ? ret_from_fork+0x1b/0x28
Code: 8b 04 24 8d 53 b4 8b 08 89 f8 e8 7c f5 fe ff 8b 5b 08 83 eb 08
39 de 75 c2 83 c4 04 5b 5e 5f 5d c3 8d 76 00 8d bc 27 00 00 00 00 <0f>
0b 8d b4 26 00 00 00 00 8d bc 27 00 00 00 00 8b 50 1c 85 d2
EIP: [<c1289440>] nf_tables_table_destroy.isra.23.part.24+0x0/0x10
SS:ESP 0068:c026fe90
---[ end trace 20fa171526d8ba2a ]---
BUG: unable to handle kernel paging request at fffffff0
IP: [<c103d4a6>] kthread_data+0x6/0x10
*pde = 014b5067 *pte = 00000000
Oops: 0000 [#2]
CPU: 0 PID: 42 Comm: kworker/u2:2 Tainted: G D 4.9.40 #1
task: c0225540 task.stack: c026e000
EIP: 0060:[<c103d4a6>] EFLAGS: 00000002 CPU: 0
EIP is at kthread_data+0x6/0x10
EAX: 00000000 EBX: c0225540 ECX: 00000001 EDX: c0225570
ESI: 00000000 EDI: c026ff98 EBP: c026ff80 ESP: c026ff64
DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 0068
CR0: 80050033 CR2: 00000014 CR3: 002a5000 CR4: 00000690
Stack:
c10387f5 c1349a27 c0225750 c026fdf0 c0225540 c026fdf0 c026ff98 c026ff88
c104210d 00000000 c102aee9 c02256cc 0126ff90 c026ff98 c026ff98 0000000b
c0270000 c13f9a10 00000000 c134cdfc 00000000 00000000 00000000 00000000
Call Trace:
[<c10387f5>] ? wq_worker_sleeping+0x5/0x70
[<c1349a27>] ? __schedule+0x207/0x350
[<c104210d>] ? do_task_dead+0x1d/0x20
[<c102aee9>] ? do_exit+0x4c9/0x7f0
[<c134cdfc>] ? rewind_stack_do_exit+0x10/0x12
Code: 27 00 00 00 00 85 c0 74 03 c6 00 00 a1 a8 11 45 c1 8b 80 e4 01
00 00 8b 40 e8 d1 e8 83 e0 01 c3 90 8d 74 26 00 8b 80 e4 01 00 00 <8b>
40 f0 c3 8d b6 00 00 00 00 83 ec 04 8b 90 e4 01 00 00 b9 04
EIP: [<c103d4a6>] kthread_data+0x6/0x10
SS:ESP 0068:c026ff64
CR2: 00000000fffffff0
---[ end trace 20fa171526d8ba2b ]---