2023-05-24 08:08:13

by Naresh Kamboju

Subject: selftests: net: udpgso_bench.sh: RIP: 0010:lookup_reuseport

While running selftests: net: udpgso_bench.sh on qemu-x86_64, the following
kernel crash was noticed on the stable-rc 6.3.4-rc2 kernel.

Reported-by: Linux Kernel Functional Testing <[email protected]>

Test run log:
=========

<12>[ 38.049122] kselftest: Running tests in net
TAP version 13
1..16
# selftests: net: udpgso_bench.sh
# ipv4
# tcp
# tcp tx: 230 MB/s 3905 calls/s 3905 msg/s
# tcp rx: 231 MB/s 3668 calls/s
# tcp tx: 225 MB/s 3817 calls/s 3817 msg/s
# tcp rx: 225 MB/s 3525 calls/s
# tcp tx: 225 MB/s 3820 calls/s 3820 msg/s
# tcp zerocopy
# tcp tx: 198 MB/s 3369 calls/s 3369 msg/s
# tcp rx: 197 MB/s 2855 calls/s
# tcp tx: 195 MB/s 3318 calls/s 3318 msg/s
# tcp rx: 197 MB/s 2845 calls/s
# udp
# udp rx: 8 MB/s 5811 calls/s
# udp tx: 11 MB/s 7938 calls/s 189 msg/s
# udp rx: 10 MB/s 7523 calls/s
# udp tx: 10 MB/s 7308 calls/s 174 msg/s
# udp rx: 10 MB/s 7338 calls/s
# udp gso
# udp rx: 19 MB/s 14080 calls/s
# udp tx: 118 MB/s 2012 calls/s 2012 msg/s
# udp rx: 26 MB/s 18688 calls/s
# udp tx: 117 MB/s 2000 calls/s 2000 msg/s
# udp rx: 26 MB/s 18688 calls/s
# udp tx: 118 MB/s 2008 calls/s 2008 msg/s
# udp gso zerocopy
# udp rx: 19 MB/s 13824 calls/s
# udp tx: 102 MB/s 1736 calls/s 1736 msg/s
# udp rx: 25 MB/s 18176 calls/s
# udp tx: 101 MB/s 1714 calls/s 1714 msg/s
# udp rx: 25 MB/s 18176 calls/s
# udp tx: 98 MB/s 1679 calls/s 1679 msg/s
# udp gso timestamp
# udp rx: 19 MB/s 13824 calls/s
# udp tx: 94 MB/s 1606 calls/s 1606 msg/s
# udp rx: 25 MB/s 18432 calls/s
# udp tx: 92 MB/s 1574 calls/s 1574 msg/s
# udp rx: 27 MB/s 19309 calls/s
# udp tx: 88 MB/s 1502 calls/s 1502 msg/s
# udp gso zerocopy audit
# udp rx: 19 MB/s 14080 calls/s
# udp tx: 101 MB/s 1728 calls/s 1728 msg/s
# udp rx: 25 MB/s 18432 calls/s
# udp tx: 100 MB/s 1699 calls/s 1699 msg/s
# udp rx: 26 MB/s 18688 calls/s
# udp tx: 101 MB/s 1724 calls/s 1724 msg/s
# Summary over 3.000 seconds...
# sum udp tx: 103 MB/s 5151 calls (1717/s) 5151 msgs (1717/s)
# Zerocopy acks: 5151
# udp gso timestamp audit
# udp rx: 19 MB/s 13843 calls/s
# udp tx: 92 MB/s 1571 calls/s 1571 msg/s
# udp rx: 26 MB/s 18568 calls/s
# udp tx: 95 MB/s 1614 calls/s 1614 msg/s
# udp rx: 26 MB/s 19200 calls/s
# udp tx: 93 MB/s 1589 calls/s 1589 msg/s
# Summary over 3.000 seconds...
# sum udp tx: 96 MB/s 4774 calls (1591/s) 4774 msgs (1591/s)
# Tx Timestamps: 4774 received 0 errors
# udp gso zerocopy timestamp audit
# udp rx: 18 MB/s 13312 calls/s
# udp tx: 76 MB/s 1297 calls/s 1297 msg/s
# udp rx: 26 MB/s 18524 calls/s
# udp tx: 74 MB/s 1269 calls/s 1269 msg/s
# udp rx: 25 MB/s 18176 calls/s
# udp tx: 75 MB/s 1289 calls/s 1289 msg/s
# Summary over 3.000 seconds...
# sum udp tx: 77 MB/s 3855 calls (1285/s) 3855 msgs (1285/s)
# Tx Timestamps: 3855 received 0 errors
# Zerocopy acks: 3855
# ipv6
# tcp
# tcp tx: 215 MB/s 3657 calls/s 3657 msg/s
# tcp rx: 216 MB/s 3431 calls/s
# tcp tx: 211 MB/s 3590 calls/s 3590 msg/s
# tcp rx: 211 MB/s 3319 calls/s
# tcp tx: 211 MB/s 3579 calls/s 3579 msg/s
# tcp zerocopy
# tcp tx: 191 MB/s 3245 calls/s 3245 msg/s
# tcp rx: 193 MB/s 2908 calls/s
# tcp tx: 184 MB/s 3135 calls/s 3135 msg/s
# tcp rx: 185 MB/s 2830 calls/s
# tcp tx: 191 MB/s 3254 calls/s 3254 msg/s
# udp
<4>[ 88.821235] int3: 0000 [#1] PREEMPT SMP PTI
<4>[ 88.821491] CPU: 1 PID: 561 Comm: udpgso_bench_tx Not tainted 6.3.4-rc2 #1
<4>[ 88.821576] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
<4>[ 88.821685] RIP: 0010:lookup_reuseport+0x4a/0x200
<4>[ 88.822122] Code: 74 0b 49 89 f6 0f b6 46 12 3c 01 75 07 31 c0 e9 ed 00 00 00 4d 89 cf 44 89 c5 49 89 cd 49 89 fc 0f 1f 44 00 00 8b 5c 24 50 0f <1f> 44 00 00 41 8b 45 04 41 33 45 00 8b 0d b0 c5 ed 00 44 8d 04 08
<4>[ 88.822175] RSP: 0018:ffffa95c800c0b90 EFLAGS: 00000206
<4>[ 88.822215] RAX: 0000000000000007 RBX: 0000000000001f40 RCX: ffff966c02b66020
<4>[ 88.822228] RDX: ffff966c01a9aa00 RSI: ffff966c02801500 RDI: ffff966c03ae2e80
<4>[ 88.822241] RBP: 00000000000093bf R08: 00000000000093bf R09: ffffffffb0b2c8a0
<4>[ 88.822254] R10: 0000000042388386 R11: 00000000000093bf R12: ffff966c03ae2e80
<4>[ 88.822266] R13: ffff966c02b66020 R14: ffff966c02801500 R15: ffffffffb0b2c8a0
<4>[ 88.822312] FS: 00007f4e6ede4740(0000) GS:ffff966c7bd00000(0000) knlGS:0000000000000000
<4>[ 88.822330] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 88.822343] CR2: 000055a1c0b90bf0 CR3: 0000000103b0e000 CR4: 00000000000006e0
<4>[ 88.822438] Call Trace:
<4>[ 88.823080] <IRQ>
<4>[ 88.823274] udp6_lib_lookup2+0xf8/0x1c0
<4>[ 88.823368] __udp6_lib_lookup+0x113/0x3c0
<4>[ 88.823382] ? __wake_up_common_lock+0x79/0x190
<4>[ 88.823403] __udp6_lib_lookup_skb+0x76/0x90
<4>[ 88.823426] __udp6_lib_rcv+0x295/0x400
<4>[ 88.823440] ip6_protocol_deliver_rcu+0x34e/0x5c0
<4>[ 88.823483] ip6_input+0x60/0x110
<4>[ 88.823496] ? ip6_rcv_core+0x311/0x450
<4>[ 88.823509] ipv6_rcv+0x47/0xf0
<4>[ 88.823523] __netif_receive_skb+0x65/0x170
<4>[ 88.823539] process_backlog+0xd7/0x180
<4>[ 88.823553] __napi_poll+0x2c/0x1b0
<4>[ 88.823565] net_rx_action+0x178/0x2e0
<4>[ 88.823580] __do_softirq+0xc4/0x274
<4>[ 88.823595] do_softirq+0x7e/0xb0
<4>[ 88.823751] </IRQ>
<4>[ 88.823769] <TASK>
<4>[ 88.823773] __local_bh_enable_ip+0x6e/0x70
<4>[ 88.823786] ip6_finish_output2+0x3fc/0x560
<4>[ 88.823803] ip6_finish_output+0x1ab/0x320
<4>[ 88.823816] ip6_output+0x6b/0x130
<4>[ 88.823827] ? __pfx_ip6_finish_output+0x10/0x10
<4>[ 88.823839] ip6_send_skb+0x1e/0x80
<4>[ 88.823850] udp_v6_send_skb+0x26e/0x400
<4>[ 88.823865] udpv6_sendmsg+0xb33/0xc60
<4>[ 88.823879] ? __pfx_ip_generic_getfrag+0x10/0x10
<4>[ 88.823902] sock_sendmsg+0x42/0xa0
<4>[ 88.823915] __sys_sendto+0x281/0x2f0
<4>[ 88.823938] __x64_sys_sendto+0x21/0x30
<4>[ 88.823949] do_syscall_64+0x48/0xa0
<4>[ 88.823969] ? exit_to_user_mode_prepare+0x2a/0x80
<4>[ 88.823981] entry_SYSCALL_64_after_hwframe+0x72/0xdc
<4>[ 88.824104] RIP: 0033:0x7f4e6eef1973
<4>[ 88.824267] Code: 8b 15 91 74 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 80 3d 71 fc 0c 00 00 41 89 ca 74 14 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 75 c3 0f 1f 40 00 55 48 83 ec 30 44 89 4c 24
<4>[ 88.824276] RSP: 002b:00007ffc3a3d79f8 EFLAGS: 00000202 ORIG_RAX: 000000000000002c
<4>[ 88.824293] RAX: ffffffffffffffda RBX: 00005596927cf110 RCX: 00007f4e6eef1973
<4>[ 88.824298] RDX: 00000000000005ac RSI: 00005596927cf110 RDI: 0000000000000005
<4>[ 88.824304] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
<4>[ 88.824309] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000002
<4>[ 88.824313] R13: 0000000000000005 R14: 000000000000e628 R15: 00000000000005ac
<4>[ 88.824335] </TASK>
<4>[ 88.824377] Modules linked in: mptcp_diag tcp_diag inet_diag ip_tables x_tables
<4>[ 88.845108] ---[ end trace 0000000000000000 ]---
<4>[ 88.845178] RIP: 0010:lookup_reuseport+0x4a/0x200
<4>[ 88.845216] Code: 74 0b 49 89 f6 0f b6 46 12 3c 01 75 07 31 c0 e9 ed 00 00 00 4d 89 cf 44 89 c5 49 89 cd 49 89 fc 0f 1f 44 00 00 8b 5c 24 50 0f <1f> 44 00 00 41 8b 45 04 41 33 45 00 8b 0d b0 c5 ed 00 44 8d 04 08
<4>[ 88.845232] RSP: 0018:ffffa95c800c0b90 EFLAGS: 00000206
<4>[ 88.845249] RAX: 0000000000000007 RBX: 0000000000001f40 RCX: ffff966c02b66020
<4>[ 88.845257] RDX: ffff966c01a9aa00 RSI: ffff966c02801500 RDI: ffff966c03ae2e80
<4>[ 88.845266] RBP: 00000000000093bf R08: 00000000000093bf R09: ffffffffb0b2c8a0
<4>[ 88.845273] R10: 0000000042388386 R11: 00000000000093bf R12: ffff966c03ae2e80
<4>[ 88.845281] R13: ffff966c02b66020 R14: ffff966c02801500 R15: ffffffffb0b2c8a0
<4>[ 88.845290] FS: 00007f4e6ede4740(0000) GS:ffff966c7bd00000(0000) knlGS:0000000000000000
<4>[ 88.845302] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
<4>[ 88.845311] CR2: 000055a1c0b90bf0 CR3: 0000000103b0e000 CR4: 00000000000006e0
<0>[ 88.845862] Kernel panic - not syncing: Fatal exception in interrupt
<0>[ 88.848258] Kernel Offset: 0x2e800000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)


log:
====
- https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2QCexh5uf81VW7HjLpuo5vu2LCe
- https://storage.tuxsuite.com/public/linaro/lkft/builds/2QCeuW0pJ8XVzYeG3rpgza2cZDW/
- https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.3.y/build/v6.3.3-364-ga37c304c022d/testrun/17170111/suite/log-parser-test/tests/


--
Linaro LKFT
https://lkft.linaro.org


2023-05-27 03:30:46

by Jakub Kicinski

Subject: Re: selftests: net: udpgso_bench.sh: RIP: 0010:lookup_reuseport

On Wed, 24 May 2023 13:24:15 +0530 Naresh Kamboju wrote:
> While running selftests: net: udpgso_bench.sh on qemu-x86_64 the following
> kernel crash noticed on stable rc 6.3.4-rc2 kernel.

Can you repro this, or is it just a one-off?

Adding some experts to CC.




2023-05-27 04:01:38

by Kuniyuki Iwashima

Subject: Re: selftests: net: udpgso_bench.sh: RIP: 0010:lookup_reuseport

From: Jakub Kicinski <[email protected]>
Date: Fri, 26 May 2023 20:16:07 -0700
> On Wed, 24 May 2023 13:24:15 +0530 Naresh Kamboju wrote:
> > While running selftests: net: udpgso_bench.sh on qemu-x86_64 the following
> > kernel crash noticed on stable rc 6.3.4-rc2 kernel.
>
> Can you repro this or it's just a one-off?
>
> Adding some experts to CC.

FWIW, I couldn't reproduce it in at least 5 runs on my x86_64 QEMU setup
with 6.4.0-rc3, so maybe it's a one-off?

---8<---
[root@localhost ~]# ./udpgso_bench.sh
...
udpgso_bench.sh: PASS=18 SKIP=0 FAIL=0
udpgso_bench.sh: PASS
---8<---

And it seems the vmlinux does not have debuginfo...
https://storage.tuxsuite.com/public/linaro/lkft/builds/2QCeuW0pJ8XVzYeG3rpgza2cZDW/

---8<---
$ echo lookup_reuseport+0x4a/0x200 | ../net-next/scripts/decode_stacktrace.sh vmlinux
lookup_reuseport (udp.c:?)
---8<---
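
Even without debuginfo, the `Code:` line of the oops can be read directly. The sketch below is an illustration and not part of the original thread: `parse_code_line` is a hypothetical helper that extracts the dumped bytes and the position of the byte wrapped in `<..>` (the byte at RIP). An int3 is a single byte (0xcc), so the RIP saved for the trap points one byte past it. In this dump the byte before RIP is 0x0f, the start of a 5-byte NOP, rather than 0xcc, which suggests (but does not prove) that the breakpoint byte had already been rewritten by the time the oops was printed.

```python
import re

# Canonical 5-byte x86-64 NOP (nopl 0x0(%rax,%rax,1)).
NOP5 = bytes.fromhex("0f1f440000")


def parse_code_line(line):
    """Return (code_bytes, ip_index) from an oops 'Code:' line.

    ip_index is the offset of the byte wrapped in <..>, i.e. the byte at RIP.
    Hypothetical helper for illustration only.
    """
    tokens = line.split("Code:", 1)[1].split()
    ip_index = None
    out = bytearray()
    for i, tok in enumerate(tokens):
        marked = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if marked:
            ip_index = i
            tok = marked.group(1)
        out.append(int(tok, 16))
    return bytes(out), ip_index


# The kernel-side Code: line from the report, with the mail line wraps joined.
CODE = (
    "Code: 74 0b 49 89 f6 0f b6 46 12 3c 01 75 07 31 c0 "
    "e9 ed 00 00 00 4d 89 cf 44 89 c5 49 89 cd 49 89 fc 0f 1f 44 00 00 8b "
    "5c 24 50 0f <1f> 44 00 00 41 8b 45 04 41 33 45 00 8b 0d b0 c5 ed 00 44 "
    "8d 04 08"
)

code, ip = parse_code_line(CODE)
byte_before_rip = code[ip - 1]        # where the int3 would have been
window = code[ip - 1:ip - 1 + 5]      # the 5 bytes starting at that address

print(f"byte before RIP: {byte_before_rip:#04x}")
print(f"5-byte NOP at trap site: {window == NOP5}")
```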



2023-05-27 09:41:07

by Arnd Bergmann

Subject: Re: selftests: net: udpgso_bench.sh: RIP: 0010:lookup_reuseport

On Sat, May 27, 2023, at 05:49, Kuniyuki Iwashima wrote:
> From: Jakub Kicinski <[email protected]>
> Date: Fri, 26 May 2023 20:16:07 -0700
>> On Wed, 24 May 2023 13:24:15 +0530 Naresh Kamboju wrote:
>> > While running selftests: net: udpgso_bench.sh on qemu-x86_64 the following
>> > kernel crash noticed on stable rc 6.3.4-rc2 kernel.
>>
>> Can you repro this or it's just a one-off?
>>
>> Adding some experts to CC.
>
> FWIW, I couldn't reproduce it on my x86_64 QEMU setup & 6.4.0-rc3
> at least 5 times, so maybe one-off ?

This looks like one of several spurious reports that lkft has produced
recently, where an 'int3' trap instruction is executed in a function
that is live-patched, but at a point where the int3 is not expected.

Anders managed to get a reproducer for one of these on his machine
yesterday, and has narrowed it down to failing on qemu-7.2.2 but
not failing on qemu-8.0.

The current theory is that this is a qemu bug when
dealing with self-modifying x86 code that has been fixed in
qemu-8.0 already, and my suggestion would be to ignore all bugs
found by lkft that involve an 'int3' trap, and instead change
the lkft setup to use either qemu-8.0 or run the test systems
in kvm (which would also be much faster and save resources).

Someone still needs to get to the bottom of this bug to see
if it's in qemu or in the kernel livepatching code, but I'm
sure it has nothing to do with the ipv6 stack.

Arnd

2023-05-27 18:07:16

by Naresh Kamboju

Subject: Re: selftests: net: udpgso_bench.sh: RIP: 0010:lookup_reuseport

Hi Arnd,

On Sat, 27 May 2023 at 15:03, Arnd Bergmann <[email protected]> wrote:
>
> On Sat, May 27, 2023, at 05:49, Kuniyuki Iwashima wrote:
> > From: Jakub Kicinski <[email protected]>
> > Date: Fri, 26 May 2023 20:16:07 -0700
> >> On Wed, 24 May 2023 13:24:15 +0530 Naresh Kamboju wrote:
> >> > While running selftests: net: udpgso_bench.sh on qemu-x86_64 the following
> >> > kernel crash noticed on stable rc 6.3.4-rc2 kernel.
> >>
> >> Can you repro this or it's just a one-off?
> >>
> >> Adding some experts to CC.
> >
> > FWIW, I couldn't reproduce it on my x86_64 QEMU setup & 6.4.0-rc3
> > at least 5 times, so maybe one-off ?
>
> This looks like one of several spurious reports that lkft has produced
> recently, where an 'int3' trap instruction is executed in a function
> that is live-patched, but at a point where the int3 is not expected.
>
> Anders managed to get a reproducer for one of these on his machine
> yesterday, and has narrowed it down to failing on qemu-7.2.2 but
> not failing on qemu-8.0.

This is an added advantage of testing on multiple qemu versions
and comparing the test results.
Thanks, Anders.

>
> The current theory right now is that this is a qemu bug when
> dealing with self-modifying x86 code that has been fixed in
> qemu-8.0 already, and my suggestion would be to ignore all bugs
> found by lkft that involve an 'int3' trap, and instead change
> the lkft setup to use either qemu-8.0 or run the test systems
> in kvm (which would also be much faster and save resources).

I will send out an update to ignore the 'int3' trap email reports.

>
> Someone still needs to get to the bottom of this bug to see
> if it's in qemu or in the kernel livepatching code, but I'm
> sure it has nothing to do with the ipv6 stack.

Thank you Arnd.

- Naresh

> Arnd

2023-05-27 19:01:34

by Arnd Bergmann

Subject: Re: selftests: net: udpgso_bench.sh: RIP: 0010:lookup_reuseport

On Sat, May 27, 2023, at 20:02, Naresh Kamboju wrote:
> On Sat, 27 May 2023 at 15:03, Arnd Bergmann <[email protected]> wrote:
>> On Sat, May 27, 2023, at 05:49, Kuniyuki Iwashima wrote:
>> The current theory right now is that this is a qemu bug when
>> dealing with self-modifying x86 code that has been fixed in
>> qemu-8.0 already, and my suggestion would be to ignore all bugs
>> found by lkft that involve an 'int3' trap, and instead change
>> the lkft setup to use either qemu-8.0 or run the test systems
>> in kvm (which would also be much faster and save resources).
>
> I will send out an update to ignore the 'int3' trap email reports.

Just to clarify: what I meant was ignoring the old reports with
qemu-7.2 but not any new ones that come from qemu-8.0.

Arnd
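
The triage rule clarified above could be sketched roughly as follows. This is only an illustration: `should_ignore` is a hypothetical name, the `fixed_in="8.0"` default reflects the theory in this thread and not a confirmed fix, and how a test system records its qemu version is an assumption.

```python
import re

# Matches the oops header of an unexpected breakpoint trap,
# e.g. "int3: 0000 [#1] PREEMPT SMP PTI".
INT3_OOPS = re.compile(r"\bint3: \d+ \[#\d+\]")


def parse_version(version):
    """Turn '7.2.2' into (7, 2, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def should_ignore(log, qemu_version, fixed_in="8.0"):
    """True only for int3 oops reports produced under a qemu older than fixed_in.

    Reports from qemu >= fixed_in, and any non-int3 crash, are kept.
    """
    is_int3 = bool(INT3_OOPS.search(log))
    return is_int3 and parse_version(qemu_version) < parse_version(fixed_in)


sample = "<4>[ 88.821235] int3: 0000 [#1] PREEMPT SMP PTI"
print(should_ignore(sample, "7.2.2"))
print(should_ignore(sample, "8.0"))
```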