2024-01-16 19:17:54

by Eric Dumazet

Subject: Re: Kernel panic in netif_rx_internal after v6 pings between netns

On Tue, Jan 16, 2024 at 7:36 PM Matthieu Baerts <[email protected]> wrote:
>
> Hello,
>
> Our MPTCP CIs recently hit some kernel panics when validating the -net
> tree + 2 pending MPTCP patches. This is on top of e327b2372bc0 ("net:
> ravb: Fix dma_addr_t truncation in error case").
>
> It looks like these panics are not related to MPTCP. That's why I'm
> sharing that here:

Indeed, this seems like an x86 issue to me (jump labels?). Are all the
stack traces pointing to the same issue?

Let's cc lkml just in case this rings a bell.

>
> > # INFO: validating network environment with pings
> > [ 45.505495] int3: 0000 [#1] PREEMPT SMP NOPTI
> > [ 45.505547] CPU: 1 PID: 1070 Comm: ping Tainted: G N 6.7.0-g244ee3389ffe #1
> > [ 45.505547] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
> > [ 45.505547] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
> > [ 45.505547] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 17 9d 11
> > All code
> > ========
> > 0: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
> > 7: 00
> > 8: 0f 1f 40 00 nopl 0x0(%rax)
> > c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > 11: 55 push %rbp
> > 12: 48 89 fd mov %rdi,%rbp
> > 15: 48 83 ec 20 sub $0x20,%rsp
> > 19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
> > 20: 00 00
> > 22: 48 89 44 24 18 mov %rax,0x18(%rsp)
> > 27: 31 c0 xor %eax,%eax
> > 29:* e9 c9 00 00 00 jmp 0xf7 <-- trapping instruction
> > 2e: 66 90 xchg %ax,%ax
> > 30: 66 90 xchg %ax,%ax
> > 32: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
> > 37: 48 89 ef mov %rbp,%rdi
> > 3a: 65 gs
> > 3b: 8b .byte 0x8b
> > 3c: 35 .byte 0x35
> > 3d: 17 (bad)
> > 3e: 9d popf
> > 3f: 11 .byte 0x11
> >
> > Code starting with the faulting instruction
> > ===========================================
> > 0: c9 leave
> > 1: 00 00 add %al,(%rax)
> > 3: 00 66 90 add %ah,-0x70(%rsi)
> > 6: 66 90 xchg %ax,%ax
> > 8: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
> > d: 48 89 ef mov %rbp,%rdi
> > 10: 65 gs
> > 11: 8b .byte 0x8b
> > 12: 35 .byte 0x35
> > 13: 17 (bad)
> > 14: 9d popf
> > 15: 11 .byte 0x11
> > [ 45.505547] RSP: 0018:ffffb106c00f0af8 EFLAGS: 00000246
> > [ 45.505547] RAX: 0000000000000000 RBX: ffff99918827b000 RCX: 0000000000000000
> > [ 45.505547] RDX: 000000000000000a RSI: ffff99918827d000 RDI: ffff9991819e6400
> > [ 45.505547] RBP: ffff9991819e6400 R08: 0000000000000000 R09: 0000000000000068
> > [ 45.505547] R10: ffff999181c104c0 R11: 736f6d6570736575 R12: ffff9991819e6400
> > [ 45.505547] R13: 0000000000000076 R14: 0000000000000000 R15: ffff99918827c000
> > [ 45.505547] FS: 00007fa1d06ca1c0(0000) GS:ffff9991fdc80000(0000) knlGS:0000000000000000
> > [ 45.505547] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 45.505547] CR2: 0000559b91aac240 CR3: 0000000004986000 CR4: 00000000000006f0
> > [ 45.505547] Call Trace:
> > [ 45.505547] <IRQ>
> > [ 45.505547] ? die (arch/x86/kernel/dumpstack.c:421)
> > [ 45.505547] ? exc_int3 (arch/x86/kernel/traps.c:762)
> > [ 45.505547] ? asm_exc_int3 (arch/x86/include/asm/idtentry.h:569)
> > [ 45.505547] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
> > [ 45.505547] ? netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
> > [ 45.505547] __netif_rx (net/core/dev.c:5084)
> > [ 45.505547] veth_xmit (drivers/net/veth.c:321)
> > [ 45.505547] dev_hard_start_xmit (include/linux/netdevice.h:4989)
> > [ 45.505547] __dev_queue_xmit (include/linux/netdevice.h:3367)
> > [ 45.505547] ? selinux_ip_postroute_compat (security/selinux/hooks.c:5783)
> > [ 45.505547] ? eth_header (net/ethernet/eth.c:85)
> > [ 45.505547] ip6_finish_output2 (include/net/neighbour.h:542)
> > [ 45.505547] ? ip6_output (include/linux/netfilter.h:301)
> > [ 45.505547] ? ip6_mtu (net/ipv6/route.c:3208)
> > [ 45.505547] ip6_send_skb (net/ipv6/ip6_output.c:1953)
> > [ 45.505547] icmpv6_echo_reply (net/ipv6/icmp.c:812)
> > [ 45.505547] ? icmpv6_rcv (net/ipv6/icmp.c:939)
> > [ 45.505547] icmpv6_rcv (net/ipv6/icmp.c:939)
> > [ 45.505547] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:440)
> > [ 45.505547] ip6_input_finish (include/linux/rcupdate.h:779)
> > [ 45.505547] __netif_receive_skb_one_core (net/core/dev.c:5537)
> > [ 45.505547] process_backlog (include/linux/rcupdate.h:779)
> > [ 45.505547] __napi_poll (net/core/dev.c:6576)
> > [ 45.505547] net_rx_action (net/core/dev.c:6647)
> > [ 45.505547] __do_softirq (arch/x86/include/asm/jump_label.h:27)
> > [ 45.505547] do_softirq (kernel/softirq.c:454)
> > [ 45.505547] </IRQ>
> > [ 45.505547] <TASK>
> > [ 45.505547] __local_bh_enable_ip (kernel/softirq.c:381)
> > [ 45.505547] __dev_queue_xmit (net/core/dev.c:4379)
> > [ 45.505547] ip6_finish_output2 (include/linux/netdevice.h:3171)
> > [ 45.505547] ? ip6_output (include/linux/netfilter.h:301)
> > [ 45.505547] ? ip6_mtu (net/ipv6/route.c:3208)
> > [ 45.505547] ip6_send_skb (net/ipv6/ip6_output.c:1953)
> > [ 45.505547] rawv6_sendmsg (net/ipv6/raw.c:584)
> > [ 45.505547] ? netfs_clear_subrequests (include/linux/list.h:373)
> > [ 45.505547] ? netfs_alloc_request (fs/netfs/objects.c:42)
> > [ 45.505547] ? folio_add_file_rmap_ptes (arch/x86/include/asm/bitops.h:206)
> > [ 45.505547] ? set_pte_range (mm/memory.c:4529)
> > [ 45.505547] ? next_uptodate_folio (include/linux/xarray.h:1699)
> > [ 45.505547] ? __sock_sendmsg (net/socket.c:733)
> > [ 45.505547] __sock_sendmsg (net/socket.c:733)
> > [ 45.505547] ? move_addr_to_kernel.part.0 (net/socket.c:253)
> > [ 45.505547] __sys_sendto (net/socket.c:2191)
> > [ 45.505547] ? __hrtimer_run_queues (include/linux/seqlock.h:566)
> > [ 45.505547] ? __do_softirq (arch/x86/include/asm/jump_label.h:27)
> > [ 45.505547] __x64_sys_sendto (net/socket.c:2203)
> > [ 45.505547] do_syscall_64 (arch/x86/entry/common.c:52)
> > [ 45.505547] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:129)
> > [ 45.505547] RIP: 0033:0x7fa1d099ca0a
> > [ 45.505547] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
> > All code
> > ========
> > 0: d8 64 89 02 fsubs 0x2(%rcx,%rcx,4)
> > 4: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
> > b: eb b8 jmp 0xffffffffffffffc5
> > d: 0f 1f 00 nopl (%rax)
> > 10: f3 0f 1e fa endbr64
> > 14: 41 89 ca mov %ecx,%r10d
> > 17: 64 8b 04 25 18 00 00 mov %fs:0x18,%eax
> > 1e: 00
> > 1f: 85 c0 test %eax,%eax
> > 21: 75 15 jne 0x38
> > 23: b8 2c 00 00 00 mov $0x2c,%eax
> > 28: 0f 05 syscall
> > 2a:* 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax <-- trapping instruction
> > 30: 77 7e ja 0xb0
> > 32: c3 ret
> > 33: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > 38: 41 54 push %r12
> > 3a: 48 83 ec 30 sub $0x30,%rsp
> > 3e: 44 rex.R
> > 3f: 89 .byte 0x89
> >
> > Code starting with the faulting instruction
> > ===========================================
> > 0: 48 3d 00 f0 ff ff cmp $0xfffffffffffff000,%rax
> > 6: 77 7e ja 0x86
> > 8: c3 ret
> > 9: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > e: 41 54 push %r12
> > 10: 48 83 ec 30 sub $0x30,%rsp
> > 14: 44 rex.R
> > 15: 89 .byte 0x89
> > [ 45.505547] RSP: 002b:00007ffe47710958 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
> > [ 45.505547] RAX: ffffffffffffffda RBX: 00007ffe47712090 RCX: 00007fa1d099ca0a
> > [ 45.505547] RDX: 0000000000000040 RSI: 0000559b91bbd300 RDI: 0000000000000003
> > [ 45.505547] RBP: 0000559b91bbd300 R08: 00007ffe477142a4 R09: 000000000000001c
> > [ 45.505547] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe47711c20
> > [ 45.505547] R13: 0000000000000040 R14: 0000559b91bbf4f4 R15: 00007ffe47712090
> > [ 45.505547] </TASK>
> > [ 45.505547] Modules linked in: mptcp_diag inet_diag mptcp_token_test mptcp_crypto_test kunit
> > [ 45.505547] ---[ end trace 0000000000000000 ]---
> > [ 45.505547] RIP: 0010:netif_rx_internal (arch/x86/include/asm/jump_label.h:27)
> > [ 45.505547] Code: 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 55 48 89 fd 48 83 ec 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 18 31 c0 e9 <c9> 00 00 00 66 90 66 90 48 8d 54 24 10 48 89 ef 65 8b 35 17 9d 11
> > All code
> > ========
> > 0: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
> > 7: 00
> > 8: 0f 1f 40 00 nopl 0x0(%rax)
> > c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> > 11: 55 push %rbp
> > 12: 48 89 fd mov %rdi,%rbp
> > 15: 48 83 ec 20 sub $0x20,%rsp
> > 19: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
> > 20: 00 00
> > 22: 48 89 44 24 18 mov %rax,0x18(%rsp)
> > 27: 31 c0 xor %eax,%eax
> > 29:* e9 c9 00 00 00 jmp 0xf7 <-- trapping instruction
> > 2e: 66 90 xchg %ax,%ax
> > 30: 66 90 xchg %ax,%ax
> > 32: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
> > 37: 48 89 ef mov %rbp,%rdi
> > 3a: 65 gs
> > 3b: 8b .byte 0x8b
> > 3c: 35 .byte 0x35
> > 3d: 17 (bad)
> > 3e: 9d popf
> > 3f: 11 .byte 0x11
> >
> > Code starting with the faulting instruction
> > ===========================================
> > 0: c9 leave
> > 1: 00 00 add %al,(%rax)
> > 3: 00 66 90 add %ah,-0x70(%rsi)
> > 6: 66 90 xchg %ax,%ax
> > 8: 48 8d 54 24 10 lea 0x10(%rsp),%rdx
> > d: 48 89 ef mov %rbp,%rdi
> > 10: 65 gs
> > 11: 8b .byte 0x8b
> > 12: 35 .byte 0x35
> > 13: 17 (bad)
> > 14: 9d popf
> > 15: 11 .byte 0x11
> > [ 45.505547] RSP: 0018:ffffb106c00f0af8 EFLAGS: 00000246
> > [ 45.505547] RAX: 0000000000000000 RBX: ffff99918827b000 RCX: 0000000000000000
> > [ 45.505547] RDX: 000000000000000a RSI: ffff99918827d000 RDI: ffff9991819e6400
> > [ 45.505547] RBP: ffff9991819e6400 R08: 0000000000000000 R09: 0000000000000068
> > [ 45.505547] R10: ffff999181c104c0 R11: 736f6d6570736575 R12: ffff9991819e6400
> > [ 45.505547] R13: 0000000000000076 R14: 0000000000000000 R15: ffff99918827c000
> > [ 45.505547] FS: 00007fa1d06ca1c0(0000) GS:ffff9991fdc80000(0000) knlGS:0000000000000000
> > [ 45.505547] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 45.505547] CR2: 0000559b91aac240 CR3: 0000000004986000 CR4: 00000000000006f0
> > [ 45.505547] Kernel panic - not syncing: Fatal exception in interrupt
> > [ 45.505547] Kernel Offset: 0x37600000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
>
>
> When hitting the panic, the MPTCP selftest was doing some pings -- v6
> according to the call trace -- between different netns: client, server,
> 2 routers in between with some TC config. See [1] for more details. In
> other words, that's before creating MPTCP connections.
>
> These panics are not easy to reproduce. In fact, we have only seen the
> issue two (maybe three) times, and only when running on GitHub Actions
> (without KVM). I didn't manage to reproduce it locally.
>
> It is only recently that we started to use GitHub Actions to do some
> validations, so I cannot confirm that this is a very recent issue. I
> think the CI hit the same issue a few days ago, on top of bec161add35b
> ("amt: do not use overwrapped cb area"), but there was another issue and
> the debug info was not stored.
>
> For reference, I originally added info in a GitHub issue [2]. If the CI
> hits the same bug again, I will add the stack trace there. Please tell
> me if I should cc someone.
>
> If you have any idea what is causing such a panic, please tell me. I can
> also add test patches in the MPTCP tree if needed.
>
>
>
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/tree/tools/testing/selftests/net/mptcp/mptcp_connect.sh?id=e327b2372bc0#n171
>
> [2]
> https://github.com/multipath-tcp/mptcp_net-next/issues/471#issuecomment-1894061756
>
>
> Cheers,
> Matt
> --
> Sponsored by the NGI0 Core fund.


2024-01-16 22:23:42

by Matthieu Baerts (NGI0)

Subject: Re: Kernel panic in netif_rx_internal after v6 pings between netns

Hi Eric,

Thank you for your quick reply!

16 Jan 2024 20:17:40 Eric Dumazet <[email protected]>:
> On Tue, Jan 16, 2024 at 7:36 PM Matthieu Baerts <[email protected]> wrote:
>> Our MPTCP CIs recently hit some kernel panics when validating the -net
>> tree + 2 pending MPTCP patches. This is on top of e327b2372bc0 ("net:
>> ravb: Fix dma_addr_t truncation in error case").
>>
>> It looks like these panics are not related to MPTCP. That's why I'm
>> sharing that here:
>
> Indeed, this seems an x86 issue to me (jump labels ?)

Thank you, good point!

(I don't know why I always assume there are no x86 issues :) )

> are all stack
> traces pointing to the same issue ?

I think so.

We hit the same stack trace twice, plus another one that sadly was not
decoded. But both occurred while doing the same thing (ping6):


# INFO: validating network environment with pings
[ 2211.138427] int3: 0000 [#1] PREEMPT SMP NOPTI
[ 2211.138427] CPU: 0 PID: 21830 Comm: ping Tainted: G                 N 6.7.0-gc6465fa4649b #1
[ 2211.138427] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
[ 2211.138427] RIP: 0010:__netif_receive_skb_core.constprop.0+0x39/0x10b0
[ 2211.138427] Code: 54 55 53 48 83 ec 78 48 8b 2f 48 89 7c 24 10 48 89 54 24 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 70 31 c0 48 89 6c 24 30 e9 <13> 08 00 00 0f 1f 44 00 00 48 8b 85 c8 00 00 00 48 2b 85 c0 00 00
[ 2211.138427] RSP: 0018:ffffb09700003e00 EFLAGS: 00000246
[ 2211.138427] RAX: 0000000000000000 RBX: ffff9eec3dc2ef10 RCX: ffff9eebc6205700
[ 2211.138427] RDX: ffffb09700003eb8 RSI: 0000000000000000 RDI: ffffb09700003eb0
[ 2211.138427] RBP: ffff9eebc6205700 R08: 0000000000000000 R09: 0000000000000048
[ 2211.138427] R10: 00000000000002ff R11: 020000ff01000000 R12: ffff9eebc82b5000
[ 2211.138427] R13: ffff9eec3dc2ee10 R14: 0000000000000000 R15: 0000000000000002
[ 2211.138427] FS:  00007fa1f295b1c0(0000) GS:ffff9eec3dc00000(0000) knlGS:0000000000000000
[ 2211.138427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2211.138427] CR2: 00005595dc9df240 CR3: 0000000004758000 CR4: 00000000000006f0
[ 2211.138427] Call Trace:
[ 2211.138427]  <IRQ>
[ 2211.138427]  ? die+0x37/0x90
[ 2211.138427]  ? exc_int3+0x10b/0x110
[ 2211.138427]  ? asm_exc_int3+0x39/0x40
[ 2211.138427]  ? __netif_receive_skb_core.constprop.0+0x39/0x10b0
[ 2211.138427]  ? __netif_receive_skb_core.constprop.0+0x39/0x10b0
[ 2211.138427]  ? ip6_finish_output2+0x209/0x670
[ 2211.138427]  ? ip6_output+0x12d/0x150
[ 2211.138427]  ? unix_stream_read_generic+0x7c4/0xb70
[ 2211.138427]  ? ip6_mtu+0x46/0x50
[ 2211.138427]  __netif_receive_skb_one_core+0x3d/0x80
[ 2211.138427]  process_backlog+0x9d/0x140
[ 2211.138427]  __napi_poll+0x26/0x1b0
[ 2211.138427]  net_rx_action+0x28f/0x300
[ 2211.138427]  __do_softirq+0xc0/0x28b
[ 2211.138427]  do_softirq+0x43/0x60
[ 2211.138427]  </IRQ>
[ 2211.138427]  <TASK>
[ 2211.138427]  __local_bh_enable_ip+0x5c/0x70
[ 2211.138427]  __dev_queue_xmit+0x28e/0xd70
[ 2211.138427]  ip6_finish_output2+0x2d8/0x670
[ 2211.138427]  ? ip6_output+0x12d/0x150
[ 2211.138427]  ? ip6_mtu+0x46/0x50
[ 2211.138427]  ip6_send_skb+0x22/0x70
[ 2211.138427]  rawv6_sendmsg+0xda5/0x10c0
[ 2211.138427]  ? netfs_clear_subrequests+0x63/0x80
[ 2211.138427]  ? netfs_alloc_request+0xec/0x130
[ 2211.138427]  ? folio_add_file_rmap_ptes+0x88/0xb0
[ 2211.138427]  ? set_pte_range+0xe8/0x310
[ 2211.138427]  ? next_uptodate_folio+0x85/0x260
[ 2211.138427]  ? __sock_sendmsg+0x38/0x70
[ 2211.138427]  __sock_sendmsg+0x38/0x70
[ 2211.138427]  ? move_addr_to_kernel.part.0+0x1b/0x60
[ 2211.138427]  __sys_sendto+0xfc/0x160
[ 2211.138427]  ? ktime_get_real_ts64+0x4d/0xf0
[ 2211.138427]  __x64_sys_sendto+0x24/0x30
[ 2211.138427]  do_syscall_64+0xad/0x1a0
[ 2211.138427]  entry_SYSCALL_64_after_hwframe+0x63/0x6b
[ 2211.138427] RIP: 0033:0x7fa1f2c2da0a
[ 2211.138427] Code: d8 64 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 15 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 7e c3 0f 1f 44 00 00 41 54 48 83 ec 30 44 89
[ 2211.138427] RSP: 002b:00007fff0d984668 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 2211.138427] RAX: ffffffffffffffda RBX: 00007fff0d985da0 RCX: 00007fa1f2c2da0a
[ 2211.138427] RDX: 0000000000000040 RSI: 00005595dcf1d300 RDI: 0000000000000003
[ 2211.138427] RBP: 00005595dcf1d300 R08: 00007fff0d987fb4 R09: 000000000000001c
[ 2211.138427] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff0d985930
[ 2211.138427] R13: 0000000000000040 R14: 00005595dcf1f4f4 R15: 00007fff0d985da0
[ 2211.138427]  </TASK>
[ 2211.138427] Modules linked in: tcp_diag act_csum act_pedit cls_fw sch_ingress xt_mark xt_statistic xt_length xt_bpf ipt_REJECT nft_tproxy nf_tproxy_ipv6 nf_tproxy_ipv4 nft_socket nf_socket_ipv4 nf_socket_ipv6 nf_tables sch_netem mptcp_diag inet_diag mptcp_token_test mptcp_crypto_test kunit
[ 2211.138427] ---[ end trace 0000000000000000 ]---
[ 2211.138427] RIP: 0010:__netif_receive_skb_core.constprop.0+0x39/0x10b0
[ 2211.138427] Code: 54 55 53 48 83 ec 78 48 8b 2f 48 89 7c 24 10 48 89 54 24 20 65 48 8b 04 25 28 00 00 00 48 89 44 24 70 31 c0 48 89 6c 24 30 e9 <13> 08 00 00 0f 1f 44 00 00 48 8b 85 c8 00 00 00 48 2b 85 c0 00 00
[ 2211.138427] RSP: 0018:ffffb09700003e00 EFLAGS: 00000246
[ 2211.138427] RAX: 0000000000000000 RBX: ffff9eec3dc2ef10 RCX: ffff9eebc6205700
[ 2211.138427] RDX: ffffb09700003eb8 RSI: 0000000000000000 RDI: ffffb09700003eb0
[ 2211.138427] RBP: ffff9eebc6205700 R08: 0000000000000000 R09: 0000000000000048
[ 2211.138427] R10: 00000000000002ff R11: 020000ff01000000 R12: ffff9eebc82b5000
[ 2211.138427] R13: ffff9eec3dc2ee10 R14: 0000000000000000 R15: 0000000000000002
[ 2211.138427] FS:  00007fa1f295b1c0(0000) GS:ffff9eec3dc00000(0000) knlGS:0000000000000000
[ 2211.138427] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2211.138427] CR2: 00005595dc9df240 CR3: 0000000004758000 CR4: 00000000000006f0
[ 2211.138427] Kernel panic - not syncing: Fatal exception in interrupt
[ 2211.138427] Kernel Offset: 0x1c400000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)


> Let's cc lkml just in case this rings a bell

Thank you! Hopefully there are still people reading lkml :)

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.

2024-01-20 17:54:02

by Matthieu Baerts (NGI0)

Subject: Re: Kernel panic in netif_rx_internal after v6 pings between netns

Hi Eric,

On 16/01/2024 22:15, Matthieu Baerts wrote:
> Hi Eric,
>
> Thank you for your quick reply!
>
> 16 Jan 2024 20:17:40 Eric Dumazet <[email protected]>:
>> On Tue, Jan 16, 2024 at 7:36 PM Matthieu Baerts <[email protected]> wrote:
>>> Our MPTCP CIs recently hit some kernel panics when validating the -net
>>> tree + 2 pending MPTCP patches. This is on top of e327b2372bc0 ("net:
>>> ravb: Fix dma_addr_t truncation in error case").
>>>
>>> It looks like these panics are not related to MPTCP. That's why I'm
>>> sharing that here:
>>
>> Indeed, this seems an x86 issue to me (jump labels ?)
>
> Thank you, good point!
>
> (I don't know why I always think there is no x86 issue :) )
>
>> are all stack
>> traces pointing to the same issue ?
>
> I think so.

FYI, I managed to find a commit that seems to be causing the issue:

8e791f7eba4c ("x86/kprobes: Drop removed INT3 handling code")

It is not clear why, but if I revert it, I can no longer reproduce the
issue. I reported the issue to the patch's author and the x86 ML:

https://lore.kernel.org/r/[email protected]

Thank you again for your help.

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.

2024-01-22 18:08:08

by Jakub Kicinski

Subject: Re: Kernel panic in netif_rx_internal after v6 pings between netns

On Sat, 20 Jan 2024 18:53:50 +0100 Matthieu Baerts wrote:
> FYI, I managed to find a commit that seems to be causing the issue:
>
> 8e791f7eba4c ("x86/kprobes: Drop removed INT3 handling code")
>
> It is not clear why, but if I revert it, I can no longer reproduce the
> issue. I reported the issue to the patch's author and the x86's ML:
>
> https://lore.kernel.org/r/[email protected]
>
> Thank you again for your help.

Hi Matthieu!

Somewhat related. What do you do currently to ignore crashes?
I was seeing a lot of:
https://netdev-2.bots.linux.dev/vmksft-net-mp/results/431181/vm-crash-thr0-2

So I hacked up this function to filter the crash from NIPA CI:
https://github.com/kuba-moo/nipa/blob/master/contest/remote/lib/vm.py#L50
It tries to get the first 5 function names from the stack to form a
"fingerprint". But I seem to recall a discussion at LPC's testing
track that there are existing solutions for generating fingerprints.
Are you aware of any?
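For illustration, the idea can be sketched roughly like this (a
simplified Python sketch, not the actual vm.py code; the frame regex and
the trace format handling are assumptions):

```python
import hashlib
import re

def crash_fingerprint(log, depth=5):
    """Sketch: derive a fingerprint from the first few function names
    found in an Oops/panic call trace (not the real NIPA vm.py code)."""
    funcs = []
    in_trace = False
    for line in log.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        # Match stack frames like "? exc_int3+0x10b/0x110"
        m = re.search(r"([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+", line)
        if m:
            funcs.append(m.group(1))
            if len(funcs) == depth:
                break
    key = ":".join(funcs)
    return key, hashlib.sha256(key.encode()).hexdigest()[:12]
```

Two crashes whose first frames match then map to the same short hash,
which can be used to deduplicate reports or ignore a known crash.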

(FWIW the crash from above seems to be gone on latest linux.git,
tonight's CI runs are crash-free.)

2024-01-22 19:20:26

by Matthieu Baerts (NGI0)

Subject: Re: Kernel panic in netif_rx_internal after v6 pings between netns

Hi Jakub,

On 22/01/2024 18:28, Jakub Kicinski wrote:

(...)

> Somewhat related. What do you do currently to ignore crashes?

I was wondering why you wanted to ignore crashes :) ... but then I saw
the new "Test ignored" and "Crashes ignored" sections on the status
page. Just to be sure: you don't want to report issues that were not
introduced by the new patches, right?

We don't need to do that on the MPTCP side:
- either it is a new crash with patches that are in review, and that is
not impacting others → we test each series individually, not a batch of
series.
- or there are issues with recent patches, not in netdev yet → we fix,
or revert.
- or there is an issue elsewhere, like the kernel panic we reported
here: usually I try to quickly apply a workaround, e.g. applying a fix,
or a revert. I don't think we have ever had an issue really impacting us
where we couldn't find a quick solution in one or two days. With the
panic we reported here, ~15% of the tests had an issue; that's "OK" to
have for a few days/weeks.

With fewer tests and a smaller community, it is easier for us to just
say on the ML and in weekly meetings: "this is a known issue, please
ignore it for the moment". But if possible, I try to add a
workaround/fix in our repo used by the CI and devs (not upstreamed).

For the NIPA CI, do you want to do like with the builds and compare with
a reference? Or with multiple references, to take unstable tests into
account? Or maintain a list of known issues (I think you already started
to do that, probably safer/easier for the moment)?

> I was seeing a lot of:
> https://netdev-2.bots.linux.dev/vmksft-net-mp/results/431181/vm-crash-thr0-2
>
> So I hacked up this function to filter the crash from NIPA CI:
> https://github.com/kuba-moo/nipa/blob/master/contest/remote/lib/vm.py#L50
> It tries to get first 5 function names from the stack, to form
> a "fingerprint". But I seem to recall a discussion at LPC's testing
> track that there are existing solutions for generating fingerprints.
> Are you aware of any?

No, sorry. But I guess something like that is used with syzkaller, no?

I have to admit that crashes (or warnings) are quite rare, so there was
no need for any automation there. But if it is easy to generate a
fingerprint, I would be interested as well: it can help with tracking,
to find occurrences of crashes/warnings that are very hard to reproduce.
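A hypothetical sketch of what such tracking could look like (names here
are made up for illustration, this is not an existing tool): keep a
per-fingerprint history plus a map of known issues, so a recurring
fingerprint can be matched and counted across CI runs:

```python
from collections import defaultdict

class CrashTracker:
    """Hypothetical sketch: track crash fingerprints across CI runs so
    rare, hard-to-reproduce crashes can be spotted when they recur."""

    def __init__(self):
        self.seen = defaultdict(list)   # fingerprint -> list of run ids
        self.known = {}                 # fingerprint -> issue reference

    def record(self, fingerprint, run_id):
        """Record one occurrence; return the known issue, or None if new."""
        self.seen[fingerprint].append(run_id)
        return self.known.get(fingerprint)

    def mark_known(self, fingerprint, issue_ref):
        """Associate a fingerprint with a tracked issue."""
        self.known[fingerprint] = issue_ref

    def occurrences(self, fingerprint):
        """All run ids where this fingerprint was seen."""
        return self.seen[fingerprint]
```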

> (FWIW the crash from above seems to be gone on latest linux.git,
> this night's CIs run are crash-free.)

Good that it was quickly fixed!

Cheers,
Matt
--
Sponsored by the NGI0 Core fund.