2023-06-26 14:45:39

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Mon, Jun 26, 2023 at 4:18 PM Alexander Lobakin
<[email protected]> wrote:
>
> From: Ian Kumlien <[email protected]>
> Date: Sun, 25 Jun 2023 12:59:54 +0200
>
> > It could actually be that it's related to: rx-gro-list but
> > rx-udp-gro-forwarding makes it trigger quicker... I have yet to
> > trigger it on igb
>
> Hi, the rx-udp-gro-forwarding author here.
>
> (good thing this appeared on IWL, which I read time to time, but please
> Cc netdev next time)
> (thus +Cc Jakub, Eric, and netdev)

Well, two things, it seems like rx-udp-gro-forwarding accelerates it
but the issue is actually in: rx-gro-list

And since i've only been able to trigger it in ixgbe i thought it
might be a driver issue =)

> > On Sat, Jun 24, 2023 at 10:03 PM Ian Kumlien <[email protected]> wrote:
> >>
> >> Hi again,
> >>
> >> I suspect that I have rounded this down to the rx-udp-gro-forwarding
> >> option... I know it's not on by default but....
> >>
> >> So, I have a machine with four nics, all using ixgbe, they are all:
> >> 06:00.0 Ethernet controller: Intel Corporation Ethernet Connection
> >> X553 1GbE (rev 11)
> >> 06:00.1 Ethernet controller: Intel Corporation Ethernet Connection
> >> X553 1GbE (rev 11)
> >> 07:00.0 Ethernet controller: Intel Corporation Ethernet Connection
> >> X553 1GbE (rev 11)
> >> 07:00.1 Ethernet controller: Intel Corporation Ethernet Connection
> >> X553 1GbE (rev 11)
> >>
> >> But I have been playing with various... currently i do:
> >> for interface in eno1 eno2 eno3 eno4 ; do
> >>   for offload in ntuple hw-tc-offload rx-gro-list ; do
> >>     ethtool -K $interface $offload on > /dev/null
> >>   done
> >>   ethtool -G $interface rx 8192 tx 8192 > /dev/null
> >> done
> >>
> >> And it all seems to work just fine for my little firewall
> >>
> >> However, enabling rx-udp-gro-forwarding results in the attached oooops
> >> (sorry, can't see more, been recreating by watching shows on HBO
> >> max... )
>
> Where's the mentioned oops? Where's the original message?

Held by the mailing list since i can only get a screenshot of it...
Will attach the latest one to this email
(I wish that i could easily get a larger backtrace but i haven't
looked in further atm)

> Can't this be related to [0]?

Don't know, my main test has been running video streams in the
background - eventually they cause an oops (within 40 minutes or so)
But i doubt it's counted as tunnel data ;)

> rx-udp-gro-forwarding is here for, uhm... 3 years? And UDP GRO in
> general is much longer. Is this a non-mainline kernel?

Mainline 6.3.9 and now 6.4.0

> So many questions :D
>
> >>
> >> The code seems to decode to:
> >> Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff 49 8b 1e 49 8b
> >> ae c8 00 00 00 41 0f b7 86 b8 00 00 00 45 0f b7 a6 b6 00 00 00 <48> 8b
> >> b3 c8 00 00 00 0f b7 8b b6 00 00 00 49 01 ec 48 01 c5 48 8d
> >> All code
> >> ========
> >> 0: c3 ret
> >> 1: 08 66 89 or %ah,-0x77(%rsi)
> >> 4: 5c pop %rsp
> >> 5: 02 04 45 84 e4 0f 85 add -0x7af01b7c(,%rax,2),%al
> >> c: 27 (bad)
> >> d: fd std
> >> e: ff (bad)
> >> f: ff 49 8b decl -0x75(%rcx)
> >> 12: 1e (bad)
> >> 13: 49 8b ae c8 00 00 00 mov 0xc8(%r14),%rbp
> >> 1a: 41 0f b7 86 b8 00 00 movzwl 0xb8(%r14),%eax
> >> 21: 00
> >> 22: 45 0f b7 a6 b6 00 00 movzwl 0xb6(%r14),%r12d
> >> 29: 00
> >> 2a:* 48 8b b3 c8 00 00 00 mov 0xc8(%rbx),%rsi <-- trapping instruction
> >> 31: 0f b7 8b b6 00 00 00 movzwl 0xb6(%rbx),%ecx
> >> 38: 49 01 ec add %rbp,%r12
> >> 3b: 48 01 c5 add %rax,%rbp
> >> 3e: 48 rex.W
> >> 3f: 8d .byte 0x8d
> >>
> >> Code starting with the faulting instruction
> >> ===========================================
> >> 0: 48 8b b3 c8 00 00 00 mov 0xc8(%rbx),%rsi
> >> 7: 0f b7 8b b6 00 00 00 movzwl 0xb6(%rbx),%ecx
> >> e: 49 01 ec add %rbp,%r12
> >> 11: 48 01 c5 add %rax,%rbp
> >> 14: 48 rex.W
> >> 15: 8d .byte 0x8d
> >>
> >> But correlating that with the source is beyond me, it could be generic
> >> but i thought i'd send it you first since it's part of the redhat
> >> guide to speeding up udp traffic
> [0]
> https://lore.kernel.org/netdev/[email protected]
>
> Thanks,
> Olek


Attachments:
iKVM_capture (1).jpg (189.24 kB)

2023-06-26 17:16:35

by Paolo Abeni

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Mon, 2023-06-26 at 16:25 +0200, Ian Kumlien wrote:
> On Mon, Jun 26, 2023 at 4:18 PM Alexander Lobakin
> <[email protected]> wrote:
> >
> > From: Ian Kumlien <[email protected]>
> > Date: Sun, 25 Jun 2023 12:59:54 +0200
> >
> > > It could actually be that it's related to: rx-gro-list but
> > > rx-udp-gro-forwarding makes it trigger quicker... I have yet to
> > > trigger it on igb
> >
> > Hi, the rx-udp-gro-forwarding author here.
> >
> > (good thing this appeared on IWL, which I read time to time, but please
> > Cc netdev next time)
> > (thus +Cc Jakub, Eric, and netdev)
>
> Well, two things, it seems like rx-udp-gro-forwarding accelerates it
> but the issue is actually in: rx-gro-list
>
> And since i've only been able to trigger it in ixgbe i thought it
> might be a driver issue =)
>
> > > On Sat, Jun 24, 2023 at 10:03 PM Ian Kumlien <[email protected]> wrote:
> > > >
> > > > Hi again,
> > > >
> > > > I suspect that I have rounded this down to the rx-udp-gro-forwarding
> > > > option... I know it's not on by default but....
> > > >
> > > > So, I have a machine with four nics, all using ixgbe, they are all:
> > > > 06:00.0 Ethernet controller: Intel Corporation Ethernet Connection
> > > > X553 1GbE (rev 11)
> > > > 06:00.1 Ethernet controller: Intel Corporation Ethernet Connection
> > > > X553 1GbE (rev 11)
> > > > 07:00.0 Ethernet controller: Intel Corporation Ethernet Connection
> > > > X553 1GbE (rev 11)
> > > > 07:00.1 Ethernet controller: Intel Corporation Ethernet Connection
> > > > X553 1GbE (rev 11)
> > > >
> > > > But I have been playing with various... currently i do:
> > > > for interface in eno1 eno2 eno3 eno4 ; do
> > > >   for offload in ntuple hw-tc-offload rx-gro-list ; do
> > > >     ethtool -K $interface $offload on > /dev/null
> > > >   done
> > > >   ethtool -G $interface rx 8192 tx 8192 > /dev/null
> > > > done
> > > >
> > > > And it all seems to work just fine for my little firewall
> > > >
> > > > However, enabling rx-udp-gro-forwarding results in the attached oooops
> > > > (sorry, can't see more, been recreating by watching shows on HBO
> > > > max... )
> >
> > Where's the mentioned oops? Where's the original message?
>
> Held by the mailing list since i can only get a screenshot of it...
> Will attach the latest one to this email

That image is not very useful/does not provide a lot of relevant
information. Could you please use kdump/crash to collect a (decoded)
full stack trace?

> (I wish that i could easily get a larger backtrace but i haven't
> looked in further atm)
>
> > Can't this be related to [0]?
>
> Don't know, my main test has been running video streams in the
> background - eventually they cause an oops (within 40 minutes or so)
> But i doubt it's counted as tunnel data ;)

I read the above as you don't have UDP tunnels in your setup. Am I
correct?

Thanks,

Paolo


2023-06-26 17:32:15

by Alexander Lobakin

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

From: Ian Kumlien <[email protected]>
Date: Mon, 26 Jun 2023 16:25:24 +0200

> On Mon, Jun 26, 2023 at 4:18 PM Alexander Lobakin
> <[email protected]> wrote:
>>
>> From: Ian Kumlien <[email protected]>
>> Date: Sun, 25 Jun 2023 12:59:54 +0200
>>
>>> It could actually be that it's related to: rx-gro-list but
>>> rx-udp-gro-forwarding makes it trigger quicker... I have yet to
>>> trigger it on igb
>>
>> Hi, the rx-udp-gro-forwarding author here.
>>
>> (good thing this appeared on IWL, which I read time to time, but please
>> Cc netdev next time)
>> (thus +Cc Jakub, Eric, and netdev)
>
> Well, two things, it seems like rx-udp-gro-forwarding accelerates it
> but the issue is actually in: rx-gro-list

Do you enable them simultaneously? I remember, when I was adding
gro-fwd, it was working (and working good) as follows:

1. gro-fwd on, gro-list off: gro-fwd
2. gro-fwd off, gro-list on: gro-list
3. gro-fwd on, gro-list on: gro-list

Note that their receive paths are independent[0]: skb_gro_receive_list()
vs skb_gro_receive(), thus I'm still not really sure how gro-fwd can
trigger gro-list's bug.

>
> And since i've only been able to trigger it in ixgbe i thought it
> might be a driver issue =)

Your screenshot says "__udp_gso_segment", which means that the
problematic UDP GRO packet hits the Tx path. Rx is in general
driver-independent. Tx has separate netdev feature ("tx-gso-list"), but
it's not supported by any driver, just software stack. It might be that
your traffic goes through a bridge or tunnel or anything else that
triggers GSO and software segmentation then booms for some reason.
BTW, __udp_gso_segment() is a one-liner when the passed skb was
gro-listed[1], so having it in the bug splat could mean the skb didn't
take that route. But hard to say with no full stacktrace.

[...]

>>>> But correlating that with the source is beyond me, it could be generic
>>>> but i thought i'd send it you first since it's part of the redhat
>>>> guide to speeding up udp traffic
>> [0]
>> https://lore.kernel.org/netdev/[email protected]
>>
>> Thanks,
>> Olek

[0]
https://elixir.bootlin.com/linux/latest/source/net/ipv4/udp_offload.c#L518
[1]
https://elixir.bootlin.com/linux/latest/source/net/ipv4/udp_offload.c#L277

Thanks,
Olek
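
One way to answer the "Do you enable them simultaneously?" question on a given machine is to read both feature states from `ethtool -k`. The sketch below is not from the thread: `feature_state` and `udp_gro_mode` are hypothetical helper names, the interface name is taken from the original report, and `udp_gro_mode` only restates the three combinations listed above. The case where both features are off is not covered there, so the sketch reports it as `unspecified`.

```bash
#!/bin/bash
# Sketch only; helper names are made up for illustration.

# Read `ethtool -k <dev>` style output on stdin and print the state
# ("on" or "off") of the feature named in $1.
feature_state() {
    awk -v f="$1:" '$1 == f { print $2 }'
}

# Restate the precedence listed above: gro-list wins whenever it is on.
# $1 = rx-udp-gro-forwarding state, $2 = rx-gro-list state.
udp_gro_mode() {
    if [ "$2" = on ]; then
        echo gro-list
    elif [ "$1" = on ]; then
        echo gro-fwd
    else
        echo unspecified   # both off: not covered by the list above
    fi
}

# Only query real hardware when ethtool and the interface are present.
dev=eno2
if command -v ethtool >/dev/null 2>&1 && [ -e "/sys/class/net/$dev" ]; then
    features=$(ethtool -k "$dev")
    fwd=$(printf '%s\n' "$features" | feature_state rx-udp-gro-forwarding)
    list=$(printf '%s\n' "$features" | feature_state rx-gro-list)
    echo "$dev: gro-fwd=$fwd gro-list=$list -> $(udp_gro_mode "$fwd" "$list")"
fi
```

Keeping the parsing and the precedence in separate functions means the table can be checked without touching a NIC.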

2023-06-26 17:34:51

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Mon, Jun 26, 2023 at 7:15 PM Alexander Lobakin
<[email protected]> wrote:
>
> From: Ian Kumlien <[email protected]>
> Date: Mon, 26 Jun 2023 16:25:24 +0200
>
> > On Mon, Jun 26, 2023 at 4:18 PM Alexander Lobakin
> > <[email protected]> wrote:
> >>
> >> From: Ian Kumlien <[email protected]>
> >> Date: Sun, 25 Jun 2023 12:59:54 +0200
> >>
> >>> It could actually be that it's related to: rx-gro-list but
> >>> rx-udp-gro-forwarding makes it trigger quicker... I have yet to
> >>> trigger it on igb
> >>
> >> Hi, the rx-udp-gro-forwarding author here.
> >>
> >> (good thing this appeared on IWL, which I read time to time, but please
> >> Cc netdev next time)
> >> (thus +Cc Jakub, Eric, and netdev)
> >
> > Well, two things, it seems like rx-udp-gro-forwarding accelerates it
> > but the issue is actually in: rx-gro-list
>
> Do you enable them simultaneously? I remember, when I was adding
> gro-fwd, it was working (and working good) as follows:
>
> 1. gro-fwd on, gro-list off: gro-fwd
> 2. gro-fwd off, gro-list on: gro-list
> 3. gro-fwd on, gro-list on: gro-list
>
> Note that their receive paths are independent[0]: skb_gro_receive_list()
> vs skb_gro_receive(), thus I'm still not really sure how gro-fwd can
> trigger gro-list's bug.

Neither am I... I have enabled sol via ipmitool now, will try to get a
better capture

> > And since i've only been able to trigger it in ixgbe i thought it
> > might be a driver issue =)
>
> Your screenshot says "__udp_gso_segment", which means that the
> problematic UDP GRO packet hits the Tx path. Rx is in general
> driver-independent. Tx has separate netdev feature ("tx-gso-list"), but
> it's not supported by any driver, just software stack. It might be that
> your traffic goes through a bridge or tunnel or anything else that
> triggers GSO and software segmentation then booms for some reason.
> BTW, __udp_gso_segment() is a one-liner when the passed skb was
> gro-listed[1], so having it in the bug splat could mean the skb didn't
> take that route. But hard to say with no full stacktrace.

I do have a UDP tunnel, in wireguard, will disable it.

Beyond that some bridges and veth interfaces, but let's wait for a full trace

> [...]
>
> >>>> But correlating that with the source is beyond me, it could be generic
> >>>> but i thought i'd send it you first since it's part of the redhat
> >>>> guide to speeding up udp traffic
> >> [0]
> >> https://lore.kernel.org/netdev/[email protected]
> >>
> >> Thanks,
> >> Olek
>
> [0]
> https://elixir.bootlin.com/linux/latest/source/net/ipv4/udp_offload.c#L518
> [1]
> https://elixir.bootlin.com/linux/latest/source/net/ipv4/udp_offload.c#L277
>
> Thanks,
> Olek

2023-06-26 17:35:42

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

There, that didn't take long, even with wireguard disabled

[14079.678380] BUG: kernel NULL pointer dereference, address: 00000000000000c0
[14079.685456] #PF: supervisor read access in kernel mode
[14079.690686] #PF: error_code(0x0000) - not-present page
[14079.695915] PGD 0 P4D 0
[14079.698540] Oops: 0000 [#1] PREEMPT SMP NOPTI
[14079.702996] CPU: 11 PID: 891 Comm: napi/eno2-80 Not tainted 6.4.0 #360
[14079.709614] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[14079.717796] RIP: 0010:__udp_gso_segment+0x346/0x4f0
[14079.722778] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
48 8d
[14079.741645] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
[14079.746966] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
[14079.754195] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
[14079.761422] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
[14079.768650] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
[14079.775879] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
[14079.783106] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
knlGS:0000000000000000
[14079.791305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14079.797162] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
[14079.804408] Call Trace:
[14079.806961] <TASK>
[14079.809170] ? __die+0x1a/0x60
[14079.812340] ? page_fault_oops+0x158/0x440
[14079.816551] ? ip6_route_output_flags+0xe3/0x160
[14079.821284] ? exc_page_fault+0x3f4/0x820
[14079.825408] ? update_load_avg+0x77/0x710
[14079.829534] ? asm_exc_page_fault+0x22/0x30
[14079.833836] ? __udp_gso_segment+0x346/0x4f0
[14079.838218] ? __udp_gso_segment+0x2fa/0x4f0
[14079.842600] ? _raw_spin_unlock_irqrestore+0x16/0x30
[14079.847679] ? try_to_wake_up+0x8e/0x5a0
[14079.851713] inet_gso_segment+0x150/0x3c0
[14079.855827] ? vhost_poll_wakeup+0x31/0x40
[14079.860032] skb_mac_gso_segment+0x9b/0x110
[14079.864331] __skb_gso_segment+0xae/0x160
[14079.868455] ? netif_skb_features+0x144/0x290
[14079.872928] validate_xmit_skb+0x167/0x370
[14079.877139] validate_xmit_skb_list+0x43/0x70
[14079.881612] sch_direct_xmit+0x267/0x380
[14079.885641] __qdisc_run+0x140/0x590
[14079.889324] __dev_queue_xmit+0x44d/0xba0
[14079.893450] ? nf_hook_slow+0x3c/0xb0
[14079.897229] br_dev_queue_push_xmit+0xb2/0x1c0
[14079.901788] maybe_deliver+0xa9/0x100
[14079.905564] br_flood+0x8a/0x180
[14079.908903] br_handle_frame_finish+0x31f/0x5b0
[14079.913547] br_handle_frame+0x28f/0x3a0
[14079.917585] ? ipv6_find_hdr+0x1f0/0x3e0
[14079.921622] ? br_handle_local_finish+0x20/0x20
[14079.926267] __netif_receive_skb_core.constprop.0+0x4c5/0xc90
[14079.932125] ? br_handle_frame_finish+0x5b0/0x5b0
[14079.936946] ? ___slab_alloc+0x4bf/0xaf0
[14079.940986] __netif_receive_skb_list_core+0x107/0x250
[14079.946240] netif_receive_skb_list_internal+0x194/0x2b0
[14079.951660] ? napi_gro_flush+0x97/0xf0
[14079.955604] napi_complete_done+0x69/0x180
[14079.959808] ixgbe_poll+0xe10/0x12e0
[14079.963506] __napi_poll+0x26/0x1b0
[14079.967106] napi_threaded_poll+0x232/0x250
[14079.971405] ? __napi_poll+0x1b0/0x1b0
[14079.975260] kthread+0xee/0x120
[14079.978510] ? kthread_complete_and_exit+0x20/0x20
[14079.983415] ret_from_fork+0x22/0x30
[14079.987102] </TASK>
[14079.989395] Modules linked in: chaoskey
[14079.993347] CR2: 00000000000000c0
[14079.996773] ---[ end trace 0000000000000000 ]---
[14080.018013] pstore: backend (erst) writing error (-28)
[14080.023274] RIP: 0010:__udp_gso_segment+0x346/0x4f0
[14080.028264] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
48 8d
[14080.047181] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
[14080.052522] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
[14080.059765] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
[14080.067012] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
[14080.074257] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
[14080.081502] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
[14080.088746] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
knlGS:0000000000000000
[14080.096964] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14080.102823] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
[14080.110067] Kernel panic - not syncing: Fatal exception in interrupt
[14080.325501] Kernel Offset: 0x12600000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[14080.353129] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt ]---
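
As a quick sanity check on the dump above: the bytes at the `<48>` marker, `48 8b b3 c0 00 00 00`, have the same `mov disp32(%rbx),%rsi` form as the trapping instruction in the earlier decode (there with displacement 0xc8), RBX is 0, and CR2 is 0xc0. The snippet below is an illustrative sketch, not part of the report, and the helper names are made up. It counts the bytes before the marker and adds the displacement to RBX.

```bash
#!/bin/bash
# Sketch only; values are copied from the oops above, helper names are made up.

code='c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5 48 8d'

# Number of bytes in a "Code:" line that precede the <xx> marker,
# i.e. the offset of the trapping instruction within the dump.
trap_offset() {
    local before=${1%%'<'*}   # strip the marker and everything after it
    set -- $before            # split the remaining bytes on whitespace
    echo $#
}

# Effective address of `mov disp(%rbx),%rsi`: base register plus displacement.
fault_addr() {
    printf '0x%x\n' $(( $1 + $2 ))
}

rbx=0x0000000000000000
disp=0xc0                     # little-endian disp32 "c0 00 00 00"
cr2=0x00000000000000c0

echo "trapping instruction at byte offset $(trap_offset "$code")"
echo "rbx + disp = $(fault_addr "$rbx" "$disp"), CR2 = $(printf '0x%x' $(( cr2 )))"
```

The offset comes out as 42 (0x2a), which lines up with the `2a:*` line in the earlier decode, and the computed address equals CR2. That is consistent with a read through a NULL pointer held in RBX, though it does not say which pointer that is.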

On Mon, Jun 26, 2023 at 7:24 PM Ian Kumlien <[email protected]> wrote:
>
> On Mon, Jun 26, 2023 at 7:15 PM Alexander Lobakin
> <[email protected]> wrote:
> >
> > From: Ian Kumlien <[email protected]>
> > Date: Mon, 26 Jun 2023 16:25:24 +0200
> >
> > > On Mon, Jun 26, 2023 at 4:18 PM Alexander Lobakin
> > > <[email protected]> wrote:
> > >>
> > >> From: Ian Kumlien <[email protected]>
> > >> Date: Sun, 25 Jun 2023 12:59:54 +0200
> > >>
> > >>> It could actually be that it's related to: rx-gro-list but
> > >>> rx-udp-gro-forwarding makes it trigger quicker... I have yet to
> > >>> trigger it on igb
> > >>
> > >> Hi, the rx-udp-gro-forwarding author here.
> > >>
> > >> (good thing this appeared on IWL, which I read time to time, but please
> > >> Cc netdev next time)
> > >> (thus +Cc Jakub, Eric, and netdev)
> > >
> > > Well, two things, it seems like rx-udp-gro-forwarding accelerates it
> > > but the issue is actually in: rx-gro-list
> >
> > Do you enable them simultaneously? I remember, when I was adding
> > gro-fwd, it was working (and working good) as follows:
> >
> > 1. gro-fwd on, gro-list off: gro-fwd
> > 2. gro-fwd off, gro-list on: gro-list
> > 3. gro-fwd on, gro-list on: gro-list
> >
> > Note that their receive paths are independent[0]: skb_gro_receive_list()
> > vs skb_gro_receive(), thus I'm still not really sure how gro-fwd can
> > trigger gro-list's bug.
>
> Neither am I... I have enabled sol via ipmitool now, will try to get a
> better capture
>
> > > And since i've only been able to trigger it in ixgbe i thought it
> > > might be a driver issue =)
> >
> > Your screenshot says "__udp_gso_segment", which means that the
> > problematic UDP GRO packet hits the Tx path. Rx is in general
> > driver-independent. Tx has separate netdev feature ("tx-gso-list"), but
> > it's not supported by any driver, just software stack. It might be that
> > your traffic goes through a bridge or tunnel or anything else that
> > triggers GSO and software segmentation then booms for some reason.
> > BTW, __udp_gso_segment() is a one-liner when the passed skb was
> > gro-listed[1], so having it in the bug splat could mean the skb didn't
> > take that route. But hard to say with no full stacktrace.
>
> I do have a UDP tunnel, in wireguard, will disable it.
>
> Beyond that some bridges and veth interfaces, but let's wait for a full trace
>
> > [...]
> >
> > >>>> But correlating that with the source is beyond me, it could be generic
> > >>>> but i thought i'd send it you first since it's part of the redhat
> > >>>> guide to speeding up udp traffic
> > >> [0]
> > >> https://lore.kernel.org/netdev/[email protected]
> > >>
> > >> Thanks,
> > >> Olek
> >
> > [0]
> > https://elixir.bootlin.com/linux/latest/source/net/ipv4/udp_offload.c#L518
> > [1]
> > https://elixir.bootlin.com/linux/latest/source/net/ipv4/udp_offload.c#L277
> >
> > Thanks,
> > Olek

2023-06-26 18:08:06

by Paolo Abeni

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Mon, 2023-06-26 at 19:30 +0200, Ian Kumlien wrote:
> There, that didn't take long, even with wireguard disabled
>
> [14079.678380] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> [14079.685456] #PF: supervisor read access in kernel mode
> [14079.690686] #PF: error_code(0x0000) - not-present page
> [14079.695915] PGD 0 P4D 0
> [14079.698540] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [14079.702996] CPU: 11 PID: 891 Comm: napi/eno2-80 Not tainted 6.4.0 #360
> [14079.709614] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> BIOS 1.7a 10/13/2022
> [14079.717796] RIP: 0010:__udp_gso_segment+0x346/0x4f0
> [14079.722778] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> 48 8d
> [14079.741645] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> [14079.746966] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> [14079.754195] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> [14079.761422] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> [14079.768650] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> [14079.775879] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> [14079.783106] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> knlGS:0000000000000000
> [14079.791305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [14079.797162] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> [14079.804408] Call Trace:
> [14079.806961] <TASK>
> [14079.809170] ? __die+0x1a/0x60
> [14079.812340] ? page_fault_oops+0x158/0x440
> [14079.816551] ? ip6_route_output_flags+0xe3/0x160
> [14079.821284] ? exc_page_fault+0x3f4/0x820
> [14079.825408] ? update_load_avg+0x77/0x710
> [14079.829534] ? asm_exc_page_fault+0x22/0x30
> [14079.833836] ? __udp_gso_segment+0x346/0x4f0
> [14079.838218] ? __udp_gso_segment+0x2fa/0x4f0
> [14079.842600] ? _raw_spin_unlock_irqrestore+0x16/0x30
> [14079.847679] ? try_to_wake_up+0x8e/0x5a0
> [14079.851713] inet_gso_segment+0x150/0x3c0
> [14079.855827] ? vhost_poll_wakeup+0x31/0x40
> [14079.860032] skb_mac_gso_segment+0x9b/0x110
> [14079.864331] __skb_gso_segment+0xae/0x160
> [14079.868455] ? netif_skb_features+0x144/0x290
> [14079.872928] validate_xmit_skb+0x167/0x370
> [14079.877139] validate_xmit_skb_list+0x43/0x70
> [14079.881612] sch_direct_xmit+0x267/0x380
> [14079.885641] __qdisc_run+0x140/0x590
> [14079.889324] __dev_queue_xmit+0x44d/0xba0
> [14079.893450] ? nf_hook_slow+0x3c/0xb0
> [14079.897229] br_dev_queue_push_xmit+0xb2/0x1c0
> [14079.901788] maybe_deliver+0xa9/0x100
> [14079.905564] br_flood+0x8a/0x180
> [14079.908903] br_handle_frame_finish+0x31f/0x5b0
> [14079.913547] br_handle_frame+0x28f/0x3a0
> [14079.917585] ? ipv6_find_hdr+0x1f0/0x3e0
> [14079.921622] ? br_handle_local_finish+0x20/0x20
> [14079.926267] __netif_receive_skb_core.constprop.0+0x4c5/0xc90
> [14079.932125] ? br_handle_frame_finish+0x5b0/0x5b0
> [14079.936946] ? ___slab_alloc+0x4bf/0xaf0
> [14079.940986] __netif_receive_skb_list_core+0x107/0x250
> [14079.946240] netif_receive_skb_list_internal+0x194/0x2b0
> [14079.951660] ? napi_gro_flush+0x97/0xf0
> [14079.955604] napi_complete_done+0x69/0x180
> [14079.959808] ixgbe_poll+0xe10/0x12e0
> [14079.963506] __napi_poll+0x26/0x1b0
> [14079.967106] napi_threaded_poll+0x232/0x250
> [14079.971405] ? __napi_poll+0x1b0/0x1b0
> [14079.975260] kthread+0xee/0x120
> [14079.978510] ? kthread_complete_and_exit+0x20/0x20
> [14079.983415] ret_from_fork+0x22/0x30
> [14079.987102] </TASK>
> [14079.989395] Modules linked in: chaoskey
> [14079.993347] CR2: 00000000000000c0
> [14079.996773] ---[ end trace 0000000000000000 ]---
> [14080.018013] pstore: backend (erst) writing error (-28)
> [14080.023274] RIP: 0010:__udp_gso_segment+0x346/0x4f0
> [14080.028264] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> 48 8d
> [14080.047181] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> [14080.052522] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> [14080.059765] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> [14080.067012] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> [14080.074257] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> [14080.081502] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> [14080.088746] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> knlGS:0000000000000000
> [14080.096964] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [14080.102823] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> [14080.110067] Kernel panic - not syncing: Fatal exception in interrupt
> [14080.325501] Kernel Offset: 0x12600000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [14080.353129] ---[ end Kernel panic - not syncing: Fatal exception in
> interrupt ]---

Could you please provide a decoded stack trace?

# in your git tree:
cat <stacktrace file > | ./scripts/decode_stacktrace.sh vmlinux

Thanks!

Paolo
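
A note on the command above, offered as a hedged sketch rather than anything stated in the thread: `scripts/decode_stacktrace.sh` reads the trace on stdin and resolves symbols through `vmlinux`, so it can only print file and line numbers if that `vmlinux` was built with debug info. Without it, the locations typically come back as `??:?`. The `bug.txt` file name and the `has_debug_info` helper below are placeholders.

```bash
#!/bin/bash
# Sketch only; has_debug_info and bug.txt are placeholders.

# Return 0 if the kernel .config given in $1 enables debug info.
has_debug_info() {
    grep -q '^CONFIG_DEBUG_INFO=y' "$1"
}

# Run from the top of the kernel tree that produced the running vmlinux.
if [ -f .config ] && [ -f bug.txt ] && [ -x scripts/decode_stacktrace.sh ]; then
    if has_debug_info .config; then
        # Redirecting the file is equivalent to piping it through cat.
        ./scripts/decode_stacktrace.sh vmlinux < bug.txt
    else
        echo "no CONFIG_DEBUG_INFO=y in .config; expect ??:? locations" >&2
    fi
fi
```

Checking the config first avoids a decode run that cannot add any source locations.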


2023-06-26 18:15:51

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Mon, Jun 26, 2023 at 7:56 PM Paolo Abeni <[email protected]> wrote:
>
> On Mon, 2023-06-26 at 19:30 +0200, Ian Kumlien wrote:
> > There, that didn't take long, even with wireguard disabled
> >
> > [14079.678380] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> > [14079.685456] #PF: supervisor read access in kernel mode
> > [14079.690686] #PF: error_code(0x0000) - not-present page
> > [14079.695915] PGD 0 P4D 0
> > [14079.698540] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [14079.702996] CPU: 11 PID: 891 Comm: napi/eno2-80 Not tainted 6.4.0 #360
> > [14079.709614] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> > BIOS 1.7a 10/13/2022
> > [14079.717796] RIP: 0010:__udp_gso_segment+0x346/0x4f0
> > [14079.722778] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> > 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> > 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> > 48 8d
> > [14079.741645] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> > [14079.746966] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> > [14079.754195] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> > [14079.761422] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> > [14079.768650] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> > [14079.775879] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> > [14079.783106] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> > knlGS:0000000000000000
> > [14079.791305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [14079.797162] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> > [14079.804408] Call Trace:
> > [14079.806961] <TASK>
> > [14079.809170] ? __die+0x1a/0x60
> > [14079.812340] ? page_fault_oops+0x158/0x440
> > [14079.816551] ? ip6_route_output_flags+0xe3/0x160
> > [14079.821284] ? exc_page_fault+0x3f4/0x820
> > [14079.825408] ? update_load_avg+0x77/0x710
> > [14079.829534] ? asm_exc_page_fault+0x22/0x30
> > [14079.833836] ? __udp_gso_segment+0x346/0x4f0
> > [14079.838218] ? __udp_gso_segment+0x2fa/0x4f0
> > [14079.842600] ? _raw_spin_unlock_irqrestore+0x16/0x30
> > [14079.847679] ? try_to_wake_up+0x8e/0x5a0
> > [14079.851713] inet_gso_segment+0x150/0x3c0
> > [14079.855827] ? vhost_poll_wakeup+0x31/0x40
> > [14079.860032] skb_mac_gso_segment+0x9b/0x110
> > [14079.864331] __skb_gso_segment+0xae/0x160
> > [14079.868455] ? netif_skb_features+0x144/0x290
> > [14079.872928] validate_xmit_skb+0x167/0x370
> > [14079.877139] validate_xmit_skb_list+0x43/0x70
> > [14079.881612] sch_direct_xmit+0x267/0x380
> > [14079.885641] __qdisc_run+0x140/0x590
> > [14079.889324] __dev_queue_xmit+0x44d/0xba0
> > [14079.893450] ? nf_hook_slow+0x3c/0xb0
> > [14079.897229] br_dev_queue_push_xmit+0xb2/0x1c0
> > [14079.901788] maybe_deliver+0xa9/0x100
> > [14079.905564] br_flood+0x8a/0x180
> > [14079.908903] br_handle_frame_finish+0x31f/0x5b0
> > [14079.913547] br_handle_frame+0x28f/0x3a0
> > [14079.917585] ? ipv6_find_hdr+0x1f0/0x3e0
> > [14079.921622] ? br_handle_local_finish+0x20/0x20
> > [14079.926267] __netif_receive_skb_core.constprop.0+0x4c5/0xc90
> > [14079.932125] ? br_handle_frame_finish+0x5b0/0x5b0
> > [14079.936946] ? ___slab_alloc+0x4bf/0xaf0
> > [14079.940986] __netif_receive_skb_list_core+0x107/0x250
> > [14079.946240] netif_receive_skb_list_internal+0x194/0x2b0
> > [14079.951660] ? napi_gro_flush+0x97/0xf0
> > [14079.955604] napi_complete_done+0x69/0x180
> > [14079.959808] ixgbe_poll+0xe10/0x12e0
> > [14079.963506] __napi_poll+0x26/0x1b0
> > [14079.967106] napi_threaded_poll+0x232/0x250
> > [14079.971405] ? __napi_poll+0x1b0/0x1b0
> > [14079.975260] kthread+0xee/0x120
> > [14079.978510] ? kthread_complete_and_exit+0x20/0x20
> > [14079.983415] ret_from_fork+0x22/0x30
> > [14079.987102] </TASK>
> > [14079.989395] Modules linked in: chaoskey
> > [14079.993347] CR2: 00000000000000c0
> > [14079.996773] ---[ end trace 0000000000000000 ]---
> > [14080.018013] pstore: backend (erst) writing error (-28)
> > [14080.023274] RIP: 0010:__udp_gso_segment+0x346/0x4f0
> > [14080.028264] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> > 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> > 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> > 48 8d
> > [14080.047181] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> > [14080.052522] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> > [14080.059765] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> > [14080.067012] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> > [14080.074257] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> > [14080.081502] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> > [14080.088746] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> > knlGS:0000000000000000
> > [14080.096964] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [14080.102823] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> > [14080.110067] Kernel panic - not syncing: Fatal exception in interrupt
> > [14080.325501] Kernel Offset: 0x12600000 from 0xffffffff81000000
> > (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [14080.353129] ---[ end Kernel panic - not syncing: Fatal exception in
> > interrupt ]---
>
> Could you please provide a decoded stack trace?
>
> # in your git tree:
> cat <stacktrace file > | ./scripts/decode_stacktrace.sh vmlinux

I'm afraid it doesn't yield more information, really... I can't say why

cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
[14079.678380] BUG: kernel NULL pointer dereference, address: 00000000000000c0
[14079.685456] #PF: supervisor read access in kernel mode
[14079.690686] #PF: error_code(0x0000) - not-present page
[14079.695915] PGD 0 P4D 0
[14079.698540] Oops: 0000 [#1] PREEMPT SMP NOPTI
[14079.702996] CPU: 11 PID: 891 Comm: napi/eno2-80 Not tainted 6.4.0 #360
[14079.709614] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[14079.717796] RIP: 0010:__udp_gso_segment (??:?)
[14079.722778] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff

Code starting with the faulting instruction
===========================================
0: c3 ret
1: 08 66 89 or %ah,-0x77(%rsi)
4: 5c pop %rsp
5: 02 04 45 84 e4 0f 85 add -0x7af01b7c(,%rax,2),%al
c: 27 (bad)
d: fd std
e: ff (bad)
f: ff .byte 0xff
49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
48 8d
[14079.741645] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
[14079.746966] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
[14079.754195] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
[14079.761422] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
[14079.768650] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
[14079.775879] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
[14079.783106] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
knlGS:0000000000000000
[14079.791305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14079.797162] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
[14079.804408] Call Trace:
[14079.806961] <TASK>
[14079.809170] ? __die (??:?)
[14079.812340] ? page_fault_oops (fault.c:?)
[14079.816551] ? ip6_route_output_flags (??:?)
[14079.821284] ? exc_page_fault (??:?)
[14079.825408] ? update_load_avg (fair.c:?)
[14079.829534] ? asm_exc_page_fault (??:?)
[14079.833836] ? __udp_gso_segment (??:?)
[14079.838218] ? __udp_gso_segment (??:?)
[14079.842600] ? _raw_spin_unlock_irqrestore (??:?)
[14079.847679] ? try_to_wake_up (core.c:?)
[14079.851713] inet_gso_segment (??:?)
[14079.855827] ? vhost_poll_wakeup (vhost.c:?)
[14079.860032] skb_mac_gso_segment (??:?)
[14079.864331] __skb_gso_segment (??:?)
[14079.868455] ? netif_skb_features (??:?)
[14079.872928] validate_xmit_skb (dev.c:?)
[14079.877139] validate_xmit_skb_list (??:?)
[14079.881612] sch_direct_xmit (??:?)
[14079.885641] __qdisc_run (??:?)
[14079.889324] __dev_queue_xmit (??:?)
[14079.893450] ? nf_hook_slow (??:?)
[14079.897229] br_dev_queue_push_xmit (??:?)
[14079.901788] maybe_deliver (br_forward.c:?)
[14079.905564] br_flood (??:?)
[14079.908903] br_handle_frame_finish (??:?)
[14079.913547] br_handle_frame (br_input.c:?)
[14079.917585] ? ipv6_find_hdr (??:?)
[14079.921622] ? br_handle_local_finish (??:?)
[14079.926267] __netif_receive_skb_core.constprop.0 (dev.c:?)
[14079.932125] ? br_handle_frame_finish (br_input.c:?)
[14079.936946] ? ___slab_alloc (slub.c:?)
[14079.940986] __netif_receive_skb_list_core (dev.c:?)
[14079.946240] netif_receive_skb_list_internal (??:?)
[14079.951660] ? napi_gro_flush (??:?)
[14079.955604] napi_complete_done (??:?)
[14079.959808] ixgbe_poll (??:?)
[14079.963506] __napi_poll (dev.c:?)
[14079.967106] napi_threaded_poll (dev.c:?)
[14079.971405] ? __napi_poll (dev.c:?)
[14079.975260] kthread (kthread.c:?)
[14079.978510] ? kthread_complete_and_exit (kthread.c:?)
[14079.983415] ret_from_fork (??:?)
[14079.987102] </TASK>
[14079.989395] Modules linked in: chaoskey
[14079.993347] CR2: 00000000000000c0
[14079.996773] ---[ end trace 0000000000000000 ]---
[14080.018013] pstore: backend (erst) writing error (-28)
[14080.023274] RIP: 0010:__udp_gso_segment (??:?)
[14080.028264] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff

Code starting with the faulting instruction
===========================================
0: c3 ret
1: 08 66 89 or %ah,-0x77(%rsi)
4: 5c pop %rsp
5: 02 04 45 84 e4 0f 85 add -0x7af01b7c(,%rax,2),%al
c: 27 (bad)
d: fd std
e: ff (bad)
f: ff .byte 0xff
49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
48 8d
[14080.047181] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
[14080.052522] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
[14080.059765] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
[14080.067012] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
[14080.074257] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
[14080.081502] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
[14080.088746] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
knlGS:0000000000000000
[14080.096964] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14080.102823] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
[14080.110067] Kernel panic - not syncing: Fatal exception in interrupt
[14080.325501] Kernel Offset: 0x12600000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[14080.353129] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt ]---

The binaries aren't stripped, so I don't currently know why it's like this...

But I also get:
gdb vmlinux
GNU gdb (Gentoo 13.2 vanilla) 13.2
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-pc-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://bugs.gentoo.org/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from vmlinux...
(No debugging symbols found in vmlinux)
Traceback (most recent call last):
File "/usr/src/linux/vmlinux-gdb.py", line 25, in <module>
import linux.constants
File "/usr/src/linux/scripts/gdb/linux/constants.py", line 10, in <module>
LX_hrtimer_resolution = gdb.parse_and_eval("hrtimer_resolution")
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
gdb.error: 'hrtimer_resolution' has unknown type; cast it to its declared type
---
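
For reference, the `(??:?)` locations in the decoded trace and gdb's "No debugging symbols found in vmlinux" message are both consistent with a vmlinux that still has its symbol table but lacks DWARF debug info, which addr2line (called by decode_stacktrace.sh) needs in order to map addresses to file:line. That is a likely explanation, not a confirmed one. Below is a minimal Python sketch for spotting the symptom in decoded output; the function name and sample lines are illustrative assumptions, and it only considers frames whose location fits on a single line.

```python
import re

# One decoded frame: "[timestamp] [? ]function (location)".
# Assumption: the location sits on the same line; wrapped lines are skipped.
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?:\?\s+)?(?P<func>[\w.]+)\s+\((?P<loc>[^)]*)\)"
)


def count_frames(decoded: str) -> tuple[int, int]:
    """Return (resolved, unresolved) frame counts for decoded oops text."""
    resolved = unresolved = 0
    for line in decoded.splitlines():
        match = FRAME_RE.match(line.strip())
        if match is None:
            continue  # register dumps, headers, wrapped lines
        # addr2line prints "?" in place of a line number it cannot resolve,
        # so "??:?" and "fault.c:?" both count as unresolved here.
        if match.group("loc").endswith(":?"):
            unresolved += 1
        else:
            resolved += 1
    return resolved, unresolved


# Hypothetical sample assembled from lines that appear in this thread.
DECODED_SAMPLE = """\
[14079.851713] inet_gso_segment (??:?)
[14079.812340] ? page_fault_oops (fault.c:?)
[   62.779267] inet_gso_segment (net/ipv4/af_inet.c:1398)
[14079.791305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
"""

if __name__ == "__main__":
    print(count_frames(DECODED_SAMPLE))
```

If nearly every frame comes back unresolved, checking that the kernel was built with debug info enabled is a reasonable first step before suspecting the script itself.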

> Thanks!
>
> Paolo
>

2023-06-26 19:17:43

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

Never mind, I think I found it. I will loop this thing until I have a
proper trace.


2023-06-26 19:25:05

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Mon, Jun 26, 2023 at 8:20 PM Ian Kumlien <[email protected]> wrote:
>
> Never mind, I think I found it. I will loop this thing until I have a
> proper trace.

Still some question marks, but much better:

cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
[ 62.624003] BUG: kernel NULL pointer dereference, address: 00000000000000c0
[ 62.631083] #PF: supervisor read access in kernel mode
[ 62.636312] #PF: error_code(0x0000) - not-present page
[ 62.641541] PGD 0 P4D 0
[ 62.644174] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 62.648629] CPU: 1 PID: 913 Comm: napi/eno2-79 Not tainted 6.4.0 #364
[ 62.655162] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[ 62.663344] RIP: 0010:__udp_gso_segment
(./include/linux/skbuff.h:2858 ./include/linux/udp.h:23
net/ipv4/udp_offload.c:228 net/ipv4/udp_offload.c:261
net/ipv4/udp_offload.c:277)
[ 62.668329] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff 49
8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2 00
00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5 48
8d
All code
========
0: c3 ret
1: 08 66 89 or %ah,-0x77(%rsi)
4: 5c pop %rsp
5: 02 04 45 84 e4 0f 85 add -0x7af01b7c(,%rax,2),%al
c: 27 (bad)
d: fd std
e: ff (bad)
f: ff 49 8b decl -0x75(%rcx)
12: 1e (bad)
13: 49 8b ae c0 00 00 00 mov 0xc0(%r14),%rbp
1a: 41 0f b7 86 b4 00 00 movzwl 0xb4(%r14),%eax
21: 00
22: 45 0f b7 a6 b2 00 00 movzwl 0xb2(%r14),%r12d
29: 00
2a:* 48 8b b3 c0 00 00 00 mov 0xc0(%rbx),%rsi <-- trapping instruction
31: 0f b7 8b b2 00 00 00 movzwl 0xb2(%rbx),%ecx
38: 49 01 ec add %rbp,%r12
3b: 48 01 c5 add %rax,%rbp
3e: 48 rex.W
3f: 8d .byte 0x8d

Code starting with the faulting instruction
===========================================
0: 48 8b b3 c0 00 00 00 mov 0xc0(%rbx),%rsi
7: 0f b7 8b b2 00 00 00 movzwl 0xb2(%rbx),%ecx
e: 49 01 ec add %rbp,%r12
11: 48 01 c5 add %rax,%rbp
14: 48 rex.W
15: 8d .byte 0x8d
[ 62.687193] RSP: 0018:ffffbd3a83b4f868 EFLAGS: 00010246
[ 62.692515] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
[ 62.699743] RDX: ffffa124def8a000 RSI: 0000000000000079 RDI: ffffa125952a14d4
[ 62.706970] RBP: ffffa124def8a000 R08: 0000000000000022 R09: 00002000001558c9
[ 62.714199] R10: 0000000000000000 R11: 00000000be554639 R12: 00000000000000e2
[ 62.721426] R13: ffffa125952a1400 R14: ffffa125952a1400 R15: 00002000001558c9
[ 62.728654] FS: 0000000000000000(0000) GS:ffffa127efa40000(0000)
knlGS:0000000000000000
[ 62.736852] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 62.742702] CR2: 00000000000000c0 CR3: 00000001034b0000 CR4: 00000000003526e0
[ 62.749948] Call Trace:
[ 62.752498] <TASK>
[ 62.754708] ? __die (arch/x86/kernel/dumpstack.c:421
arch/x86/kernel/dumpstack.c:434)
[ 62.757879] ? page_fault_oops (arch/x86/mm/fault.c:707)
[ 62.762093] ? exc_page_fault (arch/x86/mm/fault.c:1285
arch/x86/mm/fault.c:1534 arch/x86/mm/fault.c:1590)
[ 62.766215] ? asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:570)
[ 62.770508] ? __udp_gso_segment (./include/linux/skbuff.h:2858
./include/linux/udp.h:23 net/ipv4/udp_offload.c:228
net/ipv4/udp_offload.c:261 net/ipv4/udp_offload.c:277)
[ 62.774890] ? __udp_gso_segment (net/ipv4/udp_offload.c:255
net/ipv4/udp_offload.c:277)
[ 62.779267] inet_gso_segment (net/ipv4/af_inet.c:1398)
[ 62.783392] ? __wake_up_common (kernel/sched/wait.c:108)
[ 62.787605] skb_mac_gso_segment (net/core/gro.c:141)
[ 62.791906] __skb_gso_segment (net/core/dev.c:3403 (discriminator 2))
[ 62.796029] ? netif_skb_features (net/core/dev.c:3474 net/core/dev.c:3563)
[ 62.800492] validate_xmit_skb (./include/linux/netdevice.h:4862
net/core/dev.c:3659)
[ 62.804695] validate_xmit_skb_list (net/core/dev.c:3710)
[ 62.809158] sch_direct_xmit (net/sched/sch_generic.c:330)
[ 62.813198] __dev_queue_xmit (net/core/dev.c:3805 net/core/dev.c:4210)
[ 62.817314] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 62.821093] br_dev_queue_push_xmit (net/bridge/br_forward.c:55)
[ 62.825652] maybe_deliver (net/bridge/br_forward.c:193)
[ 62.829420] br_flood (net/bridge/br_forward.c:233)
[ 62.832758] br_handle_frame_finish (net/bridge/br_input.c:215)
[ 62.837403] br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[ 62.841431] ? ip6gre_tunnel_siocdevprivate (net/ipv6/ip6_gre.c:295
net/ipv6/ip6_gre.c:311 net/ipv6/ip6_gre.c:1325)
[ 62.846771] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 62.851417] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[ 62.857273] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 62.862086] ? sched_clock_cpu (kernel/sched/clock.c:387)
[ 62.866114] __netif_receive_skb_list_core (net/core/dev.c:5570)
[ 62.871367] netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[ 62.876795] napi_complete_done (./include/linux/list.h:37
./include/net/gro.h:434 ./include/net/gro.h:429 net/core/dev.c:6067)
[ 62.881004] ixgbe_poll (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3191)
[ 62.884689] ? finish_task_switch.isra.0
(./arch/x86/include/asm/irqflags.h:42
./arch/x86/include/asm/irqflags.h:77 kernel/sched/sched.h:1378
kernel/sched/core.c:5095 kernel/sched/core.c:5213)
[ 62.889679] ? __napi_poll (net/core/dev.c:6625)
[ 62.893534] __napi_poll (net/core/dev.c:6498)
[ 62.897133] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[ 62.901422] ? __napi_poll (net/core/dev.c:6625)
[ 62.905276] kthread (kernel/kthread.c:379)
[ 62.908529] ? kthread_complete_and_exit (kernel/kthread.c:332)
[ 62.913435] ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 62.917119] </TASK>
[ 62.919411] Modules linked in: chaoskey
[ 62.923357] CR2: 00000000000000c0
[ 62.926782] ---[ end trace 0000000000000000 ]---
[ 62.947865] pstore: backend (erst) writing error (-28)
[ 62.953125] RIP: 0010:__udp_gso_segment
(./include/linux/skbuff.h:2858 ./include/linux/udp.h:23
net/ipv4/udp_offload.c:228 net/ipv4/udp_offload.c:261
net/ipv4/udp_offload.c:277)
[ 62.958119] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff 49
8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2 00
00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5 48
8d
All code
========
0: c3 ret
1: 08 66 89 or %ah,-0x77(%rsi)
4: 5c pop %rsp
5: 02 04 45 84 e4 0f 85 add -0x7af01b7c(,%rax,2),%al
c: 27 (bad)
d: fd std
e: ff (bad)
f: ff 49 8b decl -0x75(%rcx)
12: 1e (bad)
13: 49 8b ae c0 00 00 00 mov 0xc0(%r14),%rbp
1a: 41 0f b7 86 b4 00 00 movzwl 0xb4(%r14),%eax
21: 00
22: 45 0f b7 a6 b2 00 00 movzwl 0xb2(%r14),%r12d
29: 00
2a:* 48 8b b3 c0 00 00 00 mov 0xc0(%rbx),%rsi <-- trapping instruction
31: 0f b7 8b b2 00 00 00 movzwl 0xb2(%rbx),%ecx
38: 49 01 ec add %rbp,%r12
3b: 48 01 c5 add %rax,%rbp
3e: 48 rex.W
3f: 8d .byte 0x8d

Code starting with the faulting instruction
===========================================
0: 48 8b b3 c0 00 00 00 mov 0xc0(%rbx),%rsi
7: 0f b7 8b b2 00 00 00 movzwl 0xb2(%rbx),%ecx
e: 49 01 ec add %rbp,%r12
11: 48 01 c5 add %rax,%rbp
14: 48 rex.W
15: 8d .byte 0x8d
[ 62.977037] RSP: 0018:ffffbd3a83b4f868 EFLAGS: 00010246
[ 62.982376] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
[ 62.989621] RDX: ffffa124def8a000 RSI: 0000000000000079 RDI: ffffa125952a14d4
[ 62.996864] RBP: ffffa124def8a000 R08: 0000000000000022 R09: 00002000001558c9
[ 63.004111] R10: 0000000000000000 R11: 00000000be554639 R12: 00000000000000e2
[ 63.011355] R13: ffffa125952a1400 R14: ffffa125952a1400 R15: 00002000001558c9
[ 63.018602] FS: 0000000000000000(0000) GS:ffffa127efa40000(0000)
knlGS:0000000000000000
[ 63.026816] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 63.032666] CR2: 00000000000000c0 CR3: 00000001034b0000 CR4: 00000000003526e0
[ 63.039911] Kernel panic - not syncing: Fatal exception in interrupt
[ 63.218844] Kernel Offset: 0x35e00000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 63.246284] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt ]---
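
Reading the decoded trace above: the trapping instruction is `mov 0xc0(%rbx),%rsi`, RBX is 0, and CR2 is 0xc0, so the fault is a load at offset 0xc0 from a NULL base pointer. The decoded location (skbuff.h:2858 via udp.h:23, called from udp_offload.c:228) suggests, but does not by itself prove, that the NULL pointer is an skb. The following is a minimal Python sketch of that cross-check; the function names and the trimmed sample are illustrative assumptions, and it only handles base registers whose dump name is the uppercase of the assembly name (rbx and RBX, not r8 and R08).

```python
import re


def parse_registers(oops: str) -> dict[str, int]:
    """Collect 'NAME: 16-hex-digit' pairs such as RBX and CR2 from oops text."""
    pairs = re.findall(r"\b(R[A-Z0-9]{2}|CR\d): ([0-9a-f]{16})\b", oops)
    return {name: int(value, 16) for name, value in pairs}


def trapping_address(oops: str) -> int:
    """Effective address of the disp(%reg) operand on the trapping line."""
    regs = parse_registers(oops)
    for line in oops.splitlines():
        if "<-- trapping instruction" not in line:
            continue
        operand = re.search(r"(0x[0-9a-f]+)\(%(\w+)\)", line)
        if operand is None:
            break
        displacement = int(operand.group(1), 16)
        base = regs[operand.group(2).upper()]
        return base + displacement
    raise ValueError("no trapping instruction with a disp(%reg) operand found")


# Hypothetical sample trimmed from the decoded oops in this message.
OOPS_SAMPLE = """\
2a:* 48 8b b3 c0 00 00 00 mov 0xc0(%rbx),%rsi <-- trapping instruction
[ 62.692515] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
[ 62.742702] CR2: 00000000000000c0 CR3: 00000001034b0000 CR4: 00000000003526e0
"""

if __name__ == "__main__":
    regs = parse_registers(OOPS_SAMPLE)
    address = trapping_address(OOPS_SAMPLE)
    # CR2 holds the faulting address; a match confirms the operand reading.
    print(hex(address), address == regs["CR2"])
```

A match between the computed address and CR2 only confirms which operand faulted; finding why the pointer was NULL still requires reading the source at the decoded location.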

> On Mon, Jun 26, 2023 at 8:01 PM Ian Kumlien <[email protected]> wrote:
> >
> > On Mon, Jun 26, 2023 at 7:56 PM Paolo Abeni <[email protected]> wrote:
> > >
> > > On Mon, 2023-06-26 at 19:30 +0200, Ian Kumlien wrote:
> > > > There, that didn't take long, even with wireguard disabled
> > > >
> > > > [14079.678380] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> > > > [14079.685456] #PF: supervisor read access in kernel mode
> > > > [14079.690686] #PF: error_code(0x0000) - not-present page
> > > > [14079.695915] PGD 0 P4D 0
> > > > [14079.698540] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > > [14079.702996] CPU: 11 PID: 891 Comm: napi/eno2-80 Not tainted 6.4.0 #360
> > > > [14079.709614] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> > > > BIOS 1.7a 10/13/2022
> > > > [14079.717796] RIP: 0010:__udp_gso_segment+0x346/0x4f0
> > > > [14079.722778] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> > > > 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> > > > 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> > > > 48 8d
> > > > [14079.741645] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> > > > [14079.746966] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> > > > [14079.754195] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> > > > [14079.761422] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> > > > [14079.768650] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> > > > [14079.775879] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> > > > [14079.783106] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> > > > knlGS:0000000000000000
> > > > [14079.791305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [14079.797162] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> > > > [14079.804408] Call Trace:
> > > > [14079.806961] <TASK>
> > > > [14079.809170] ? __die+0x1a/0x60
> > > > [14079.812340] ? page_fault_oops+0x158/0x440
> > > > [14079.816551] ? ip6_route_output_flags+0xe3/0x160
> > > > [14079.821284] ? exc_page_fault+0x3f4/0x820
> > > > [14079.825408] ? update_load_avg+0x77/0x710
> > > > [14079.829534] ? asm_exc_page_fault+0x22/0x30
> > > > [14079.833836] ? __udp_gso_segment+0x346/0x4f0
> > > > [14079.838218] ? __udp_gso_segment+0x2fa/0x4f0
> > > > [14079.842600] ? _raw_spin_unlock_irqrestore+0x16/0x30
> > > > [14079.847679] ? try_to_wake_up+0x8e/0x5a0
> > > > [14079.851713] inet_gso_segment+0x150/0x3c0
> > > > [14079.855827] ? vhost_poll_wakeup+0x31/0x40
> > > > [14079.860032] skb_mac_gso_segment+0x9b/0x110
> > > > [14079.864331] __skb_gso_segment+0xae/0x160
> > > > [14079.868455] ? netif_skb_features+0x144/0x290
> > > > [14079.872928] validate_xmit_skb+0x167/0x370
> > > > [14079.877139] validate_xmit_skb_list+0x43/0x70
> > > > [14079.881612] sch_direct_xmit+0x267/0x380
> > > > [14079.885641] __qdisc_run+0x140/0x590
> > > > [14079.889324] __dev_queue_xmit+0x44d/0xba0
> > > > [14079.893450] ? nf_hook_slow+0x3c/0xb0
> > > > [14079.897229] br_dev_queue_push_xmit+0xb2/0x1c0
> > > > [14079.901788] maybe_deliver+0xa9/0x100
> > > > [14079.905564] br_flood+0x8a/0x180
> > > > [14079.908903] br_handle_frame_finish+0x31f/0x5b0
> > > > [14079.913547] br_handle_frame+0x28f/0x3a0
> > > > [14079.917585] ? ipv6_find_hdr+0x1f0/0x3e0
> > > > [14079.921622] ? br_handle_local_finish+0x20/0x20
> > > > [14079.926267] __netif_receive_skb_core.constprop.0+0x4c5/0xc90
> > > > [14079.932125] ? br_handle_frame_finish+0x5b0/0x5b0
> > > > [14079.936946] ? ___slab_alloc+0x4bf/0xaf0
> > > > [14079.940986] __netif_receive_skb_list_core+0x107/0x250
> > > > [14079.946240] netif_receive_skb_list_internal+0x194/0x2b0
> > > > [14079.951660] ? napi_gro_flush+0x97/0xf0
> > > > [14079.955604] napi_complete_done+0x69/0x180
> > > > [14079.959808] ixgbe_poll+0xe10/0x12e0
> > > > [14079.963506] __napi_poll+0x26/0x1b0
> > > > [14079.967106] napi_threaded_poll+0x232/0x250
> > > > [14079.971405] ? __napi_poll+0x1b0/0x1b0
> > > > [14079.975260] kthread+0xee/0x120
> > > > [14079.978510] ? kthread_complete_and_exit+0x20/0x20
> > > > [14079.983415] ret_from_fork+0x22/0x30
> > > > [14079.987102] </TASK>
> > > > [14079.989395] Modules linked in: chaoskey
> > > > [14079.993347] CR2: 00000000000000c0
> > > > [14079.996773] ---[ end trace 0000000000000000 ]---
> > > > [14080.018013] pstore: backend (erst) writing error (-28)
> > > > [14080.023274] RIP: 0010:__udp_gso_segment+0x346/0x4f0
> > > > [14080.028264] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> > > > 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> > > > 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> > > > 48 8d
> > > > [14080.047181] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> > > > [14080.052522] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> > > > [14080.059765] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> > > > [14080.067012] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> > > > [14080.074257] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> > > > [14080.081502] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> > > > [14080.088746] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> > > > knlGS:0000000000000000
> > > > [14080.096964] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [14080.102823] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> > > > [14080.110067] Kernel panic - not syncing: Fatal exception in interrupt
> > > > [14080.325501] Kernel Offset: 0x12600000 from 0xffffffff81000000
> > > > (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > > > [14080.353129] ---[ end Kernel panic - not syncing: Fatal exception in
> > > > interrupt ]---
> > >
> > > Could you please provide a decoded stack trace?
> > >
> > > # in your git tree:
> > > cat <stacktrace file > | ./scripts/decode_stacktrace.sh vmlinux
> >
> > I'm afraid it doesn't yield more information, really... I can't say why
> >
> > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > [14079.678380] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> > [14079.685456] #PF: supervisor read access in kernel mode
> > [14079.690686] #PF: error_code(0x0000) - not-present page
> > [14079.695915] PGD 0 P4D 0
> > [14079.698540] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [14079.702996] CPU: 11 PID: 891 Comm: napi/eno2-80 Not tainted 6.4.0 #360
> > [14079.709614] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> > BIOS 1.7a 10/13/2022
> > [14079.717796] RIP: 0010:__udp_gso_segment (??:?)
> > [14079.722778] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> >
> > Code starting with the faulting instruction
> > ===========================================
> > 0: c3 ret
> > 1: 08 66 89 or %ah,-0x77(%rsi)
> > 4: 5c pop %rsp
> > 5: 02 04 45 84 e4 0f 85 add -0x7af01b7c(,%rax,2),%al
> > c: 27 (bad)
> > d: fd std
> > e: ff (bad)
> > f: ff .byte 0xff
> > 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> > 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> > 48 8d
> > [14079.741645] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> > [14079.746966] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> > [14079.754195] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> > [14079.761422] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> > [14079.768650] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> > [14079.775879] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> > [14079.783106] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> > knlGS:0000000000000000
> > [14079.791305] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [14079.797162] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> > [14079.804408] Call Trace:
> > [14079.806961] <TASK>
> > [14079.809170] ? __die (??:?)
> > [14079.812340] ? page_fault_oops (fault.c:?)
> > [14079.816551] ? ip6_route_output_flags (??:?)
> > [14079.821284] ? exc_page_fault (??:?)
> > [14079.825408] ? update_load_avg (fair.c:?)
> > [14079.829534] ? asm_exc_page_fault (??:?)
> > [14079.833836] ? __udp_gso_segment (??:?)
> > [14079.838218] ? __udp_gso_segment (??:?)
> > [14079.842600] ? _raw_spin_unlock_irqrestore (??:?)
> > [14079.847679] ? try_to_wake_up (core.c:?)
> > [14079.851713] inet_gso_segment (??:?)
> > [14079.855827] ? vhost_poll_wakeup (vhost.c:?)
> > [14079.860032] skb_mac_gso_segment (??:?)
> > [14079.864331] __skb_gso_segment (??:?)
> > [14079.868455] ? netif_skb_features (??:?)
> > [14079.872928] validate_xmit_skb (dev.c:?)
> > [14079.877139] validate_xmit_skb_list (??:?)
> > [14079.881612] sch_direct_xmit (??:?)
> > [14079.885641] __qdisc_run (??:?)
> > [14079.889324] __dev_queue_xmit (??:?)
> > [14079.893450] ? nf_hook_slow (??:?)
> > [14079.897229] br_dev_queue_push_xmit (??:?)
> > [14079.901788] maybe_deliver (br_forward.c:?)
> > [14079.905564] br_flood (??:?)
> > [14079.908903] br_handle_frame_finish (??:?)
> > [14079.913547] br_handle_frame (br_input.c:?)
> > [14079.917585] ? ipv6_find_hdr (??:?)
> > [14079.921622] ? br_handle_local_finish (??:?)
> > [14079.926267] __netif_receive_skb_core.constprop.0 (dev.c:?)
> > [14079.932125] ? br_handle_frame_finish (br_input.c:?)
> > [14079.936946] ? ___slab_alloc (slub.c:?)
> > [14079.940986] __netif_receive_skb_list_core (dev.c:?)
> > [14079.946240] netif_receive_skb_list_internal (??:?)
> > [14079.951660] ? napi_gro_flush (??:?)
> > [14079.955604] napi_complete_done (??:?)
> > [14079.959808] ixgbe_poll (??:?)
> > [14079.963506] __napi_poll (dev.c:?)
> > [14079.967106] napi_threaded_poll (dev.c:?)
> > [14079.971405] ? __napi_poll (dev.c:?)
> > [14079.975260] kthread (kthread.c:?)
> > [14079.978510] ? kthread_complete_and_exit (kthread.c:?)
> > [14079.983415] ret_from_fork (??:?)
> > [14079.987102] </TASK>
> > [14079.989395] Modules linked in: chaoskey
> > [14079.993347] CR2: 00000000000000c0
> > [14079.996773] ---[ end trace 0000000000000000 ]---
> > [14080.018013] pstore: backend (erst) writing error (-28)
> > [14080.023274] RIP: 0010:__udp_gso_segment (??:?)
> > [14080.028264] Code: c3 08 66 89 5c 02 04 45 84 e4 0f 85 27 fd ff ff
> >
> > Code starting with the faulting instruction
> > ===========================================
> > 0: c3 ret
> > 1: 08 66 89 or %ah,-0x77(%rsi)
> > 4: 5c pop %rsp
> > 5: 02 04 45 84 e4 0f 85 add -0x7af01b7c(,%rax,2),%al
> > c: 27 (bad)
> > d: fd std
> > e: ff (bad)
> > f: ff .byte 0xff
> > 49 8b 1e 49 8b ae c0 00 00 00 41 0f b7 86 b4 00 00 00 45 0f b7 a6 b2
> > 00 00 00 <48> 8b b3 c0 00 00 00 0f b7 8b b2 00 00 00 49 01 ec 48 01 c5
> > 48 8d
> > [14080.047181] RSP: 0018:ffffa83643a4f818 EFLAGS: 00010246
> > [14080.052522] RAX: 00000000000000ce RBX: 0000000000000000 RCX: 0000000000000000
> > [14080.059765] RDX: ffffa2ad1403b000 RSI: 0000000000000028 RDI: ffffa2afc9d302d4
> > [14080.067012] RBP: ffffa2ad1403b000 R08: 0000000000000022 R09: 00002000001558c9
> > [14080.074257] R10: 0000000000000000 R11: ffffa2b02fcea888 R12: 00000000000000e2
> > [14080.081502] R13: ffffa2afc9d30200 R14: ffffa2afc9d30200 R15: 00002000001558c9
> > [14080.088746] FS: 0000000000000000(0000) GS:ffffa2b02fcc0000(0000)
> > knlGS:0000000000000000
> > [14080.096964] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [14080.102823] CR2: 00000000000000c0 CR3: 0000000151ff4000 CR4: 00000000003526e0
> > [14080.110067] Kernel panic - not syncing: Fatal exception in interrupt
> > [14080.325501] Kernel Offset: 0x12600000 from 0xffffffff81000000
> > (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [14080.353129] ---[ end Kernel panic - not syncing: Fatal exception in
> > interrupt ]---
> >
> > The binaries aren't stripped so i don't, currently, know why it's like this...
> >
> > but i also get:
> > gdb vmlinux
> > GNU gdb (Gentoo 13.2 vanilla) 13.2
> > Copyright (C) 2023 Free Software Foundation, Inc.
> > License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> > This is free software: you are free to change and redistribute it.
> > There is NO WARRANTY, to the extent permitted by law.
> > Type "show copying" and "show warranty" for details.
> > This GDB was configured as "x86_64-pc-linux-gnu".
> > Type "show configuration" for configuration details.
> > For bug reporting instructions, please see:
> > <https://bugs.gentoo.org/>.
> > Find the GDB manual and other documentation resources online at:
> > <http://www.gnu.org/software/gdb/documentation/>.
> >
> > For help, type "help".
> > Type "apropos word" to search for commands related to "word"...
> > Reading symbols from vmlinux...
> > (No debugging symbols found in vmlinux)
> > Traceback (most recent call last):
> > File "/usr/src/linux/vmlinux-gdb.py", line 25, in <module>
> > import linux.constants
> > File "/usr/src/linux/scripts/gdb/linux/constants.py", line 10, in <module>
> > LX_hrtimer_resolution = gdb.parse_and_eval("hrtimer_resolution")
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > gdb.error: 'hrtimer_resolution' has unknown type; cast it to its declared type
> > ---
> >
> > > Thanks!
> > >
> > > Paolo
> > >

2023-06-27 09:32:37

by Paolo Abeni

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Mon, 2023-06-26 at 20:59 +0200, Ian Kumlien wrote:
> On Mon, Jun 26, 2023 at 8:20 PM Ian Kumlien <[email protected]> wrote:
> >
> > Nevermind, I think I found it, I will loop this thing until I have a
> > proper trace....
>
> Still some question marks, but much better

Thanks!
>
> cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> [ 62.624003] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> [ 62.631083] #PF: supervisor read access in kernel mode
> [ 62.636312] #PF: error_code(0x0000) - not-present page
> [ 62.641541] PGD 0 P4D 0
> [ 62.644174] Oops: 0000 [#1] PREEMPT SMP NOPTI
> [ 62.648629] CPU: 1 PID: 913 Comm: napi/eno2-79 Not tainted 6.4.0 #364
> [ 62.655162] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> BIOS 1.7a 10/13/2022
> [ 62.663344] RIP: 0010:__udp_gso_segment
> (./include/linux/skbuff.h:2858 ./include/linux/udp.h:23
> net/ipv4/udp_offload.c:228 net/ipv4/udp_offload.c:261
> net/ipv4/udp_offload.c:277)

So it's faulting here:

static struct sk_buff *__udpv4_gso_segment_list_csum(struct sk_buff *segs)
{
struct sk_buff *seg;
struct udphdr *uh, *uh2;
struct iphdr *iph, *iph2;

seg = segs;
uh = udp_hdr(seg);
iph = ip_hdr(seg);

if ((udp_hdr(seg)->dest == udp_hdr(seg->next)->dest) &&
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The GSO segment has been assembled by skb_gro_receive_list().
I guess seg->next is NULL, which is somewhat unexpected as
napi_gro_complete() clears the gso_size when sending a single frame
up the stack.

On the flip side, AFAICS, nothing prevents the stack from changing the
aggregated packet layout (e.g. pulling data and/or linearizing the
skb).
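
For illustration only, here is a user-space sketch of the failure mode
guessed at above: a segment list holding a single element, where reading
through seg->next touches a small offset from NULL (which would fit a
fault address like 0xc0). struct fake_skb, its dest field and
first_two_same_dest() are made up for this sketch and are not kernel
APIs; the real sk_buff layout differs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sk_buff; only what the sketch needs. */
struct fake_skb {
	struct fake_skb *next;	/* next segment, NULL at the end of the list */
	unsigned short dest;	/* stand-in for udp_hdr(skb)->dest */
};

/*
 * Compare the destination of the first two segments.
 * Returns 1 if equal, 0 if different, -1 if there is no second segment.
 * Without the NULL check, segs->next->dest would dereference NULL
 * whenever the list holds a single segment.
 */
static int first_two_same_dest(const struct fake_skb *segs)
{
	if (!segs || !segs->next)
		return -1;
	return segs->dest == segs->next->dest;
}

/* Small self-check exercising the three outcomes. */
void check_first_two_same_dest(void)
{
	struct fake_skb second = { .next = NULL, .dest = 53 };
	struct fake_skb first = { .next = &second, .dest = 53 };
	struct fake_skb lone = { .next = NULL, .dest = 53 };

	assert(first_two_same_dest(&first) == 1);
	second.dest = 123;
	assert(first_two_same_dest(&first) == 0);
	assert(first_two_same_dest(&lone) == -1);
}
```

The -1 branch is the interesting one: it is the case the comparison in
__udpv4_gso_segment_list_csum() does not appear to expect.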

In any case this looks more related to rx-gro-list than to
rx-udp-gro-forwarding. I understand you have both features enabled in
your env?

Side question: do you have any non-trivial nf/br filter rules?

The following could possibly validate the above and avoid the issue,
but it's a bit papering over it. Could you please try it in your env?

Thanks!

Paolo
---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6c5915efbc17..75531686bfdf 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4319,6 +4319,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

skb->prev = tail;

+ if (WARN_ON_ONCE(!skb->next))
+ goto err_linearize;
+
if (skb_needs_linearize(skb, features) &&
__skb_linearize(skb))
goto err_linearize;
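
A note on the guard used in the hunk above, as a user-space sketch rather
than kernel code: WARN_ON_ONCE() evaluates to the condition it is given, so
it can drive the error path while logging only the first occurrence. The
kernel macro keeps its "already warned" flag per call site; the helper
warn_on_once() and the caller check_list() below are hypothetical names
that pass the flag explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper modelled on WARN_ON_ONCE(): report the condition
 * the first time it is true, and always return the condition so the
 * caller can branch on it.
 */
static bool warn_on_once(bool cond, bool *warned, const char *what)
{
	if (cond && !*warned) {
		*warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical caller: bail out (-1) when there is no second element. */
int check_list(const void *next)
{
	static bool warned;	/* one flag for this call site */

	if (warn_on_once(next == NULL, &warned,
			 "segment list has a single element"))
		return -1;
	return 0;
}

/* Small self-check: the error path is taken every time, not just once. */
void check_warn_on_once(void)
{
	int dummy = 0;

	assert(check_list(NULL) == -1);
	assert(check_list(NULL) == -1);
	assert(check_list(&dummy) == 0);
}
```

Only the message is rate-limited; the bail-out still happens on every
offending packet.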


2023-06-27 13:11:26

by Ian Kumlien

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Tue, Jun 27, 2023 at 11:19 AM Paolo Abeni <[email protected]> wrote:
>
> On Mon, 2023-06-26 at 20:59 +0200, Ian Kumlien wrote:
> > On Mon, Jun 26, 2023 at 8:20 PM Ian Kumlien <[email protected]> wrote:
> > >
> > > Nevermind, I think I found it, I will loop this thing until I have a
> > > proper trace....
> >
> > Still some question marks, but much better
>
> Thanks!
> >
> > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > [ 62.624003] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> > [ 62.631083] #PF: supervisor read access in kernel mode
> > [ 62.636312] #PF: error_code(0x0000) - not-present page
> > [ 62.641541] PGD 0 P4D 0
> > [ 62.644174] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > [ 62.648629] CPU: 1 PID: 913 Comm: napi/eno2-79 Not tainted 6.4.0 #364
> > [ 62.655162] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> > BIOS 1.7a 10/13/2022
> > [ 62.663344] RIP: 0010:__udp_gso_segment
> > (./include/linux/skbuff.h:2858 ./include/linux/udp.h:23
> > net/ipv4/udp_offload.c:228 net/ipv4/udp_offload.c:261
> > net/ipv4/udp_offload.c:277)
>
> So it's faulting here:
>
> static struct sk_buff *__udpv4_gso_segment_list_csum(struct sk_buff *segs)
> {
> struct sk_buff *seg;
> struct udphdr *uh, *uh2;
> struct iphdr *iph, *iph2;
>
> seg = segs;
> uh = udp_hdr(seg);
> iph = ip_hdr(seg);
>
> if ((udp_hdr(seg)->dest == udp_hdr(seg->next)->dest) &&
> // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>
> The GSO segment has been assembled by skb_gro_receive_list()
> I guess seg->next is NULL, which is somewhat unexpected as
> napi_gro_complete() clears the gso_size when sending up the stack a
> single frame.
>
> On the flip side, AFAICS, nothing prevents the stack from changing the
> aggregated packet layout (e.g. pulling data and/or linearizing the
> skb).
>
> In any case this looks more related to rx-gro-list then rx-udp-gro-
> forwarding. I understand you have both feature enabled in your env?
>
> Side questions: do you have any non trivial nf/br filter rule?
>
> The following could possibly validate the above and avoid the issue,
> but it's a bit papering over it. Could you please try it in your env?

Will do as soon as i get home =)

> Thanks!
>
> Paolo
> ---
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 6c5915efbc17..75531686bfdf 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4319,6 +4319,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>
> skb->prev = tail;
>
> + if (WARN_ON_ONCE(!skb->next))
> + goto err_linearize;
> +
> if (skb_needs_linearize(skb, features) &&
> __skb_linearize(skb))
> goto err_linearize;
>

2023-06-28 08:44:41

by Ian Kumlien

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

Been running all night but eventually it crashed again...

[21753.055795] Out of memory: Killed process 970 (qemu-system-x86)
total-vm:4709488kB, anon-rss:2172652kB, file-rss:4608kB,
shmem-rss:0kB, UID:77 pgtables:4800kB oom_score_adj:0
[24249.061154] general protection fault, probably for non-canonical
address 0xb0746d4e6bee35e2: 0000 [#1] PREEMPT SMP NOPTI
[24249.072138] CPU: 0 PID: 893 Comm: napi/eno1-68 Tainted: G W
6.4.0-dirty #366
[24249.080670] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[24249.088852] RIP: 0010:kmem_cache_alloc_bulk (mm/slub.c:377
mm/slub.c:388 mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
[24249.094086] Code: 0f 84 46 ff ff ff 65 ff 05 a4 bd e4 47 48 8b 4d
00 65 48 03 0d e8 5f e3 47 9c 5e fa 45 31 d2 eb 2f 8b 45 28 48 01 d0
48 89 c7 <48> 8b 00 48 33 85 b8 00 00 00 48 0f cf 48 31 f8 48 89 01 49
89 17
All code
========
0: 0f 84 46 ff ff ff je 0xffffffffffffff4c
6: 65 ff 05 a4 bd e4 47 incl %gs:0x47e4bda4(%rip) # 0x47e4bdb1
d: 48 8b 4d 00 mov 0x0(%rbp),%rcx
11: 65 48 03 0d e8 5f e3 add %gs:0x47e35fe8(%rip),%rcx # 0x47e36001
18: 47
19: 9c pushf
1a: 5e pop %rsi
1b: fa cli
1c: 45 31 d2 xor %r10d,%r10d
1f: eb 2f jmp 0x50
21: 8b 45 28 mov 0x28(%rbp),%eax
24: 48 01 d0 add %rdx,%rax
27: 48 89 c7 mov %rax,%rdi
2a:* 48 8b 00 mov (%rax),%rax <-- trapping instruction
2d: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
34: 48 0f cf bswap %rdi
37: 48 31 f8 xor %rdi,%rax
3a: 48 89 01 mov %rax,(%rcx)
3d: 49 89 17 mov %rdx,(%r15)

Code starting with the faulting instruction
===========================================
0: 48 8b 00 mov (%rax),%rax
3: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
a: 48 0f cf bswap %rdi
d: 48 31 f8 xor %rdi,%rax
10: 48 89 01 mov %rax,(%rcx)
13: 49 89 17 mov %rdx,(%r15)
[24249.112951] RSP: 0018:ffff9fc303973d20 EFLAGS: 00010086
[24249.118275] RAX: b0746d4e6bee35e2 RBX: 0000000000000001 RCX: ffff8d5a2fa31da0
[24249.125501] RDX: b0746d4e6bee3572 RSI: 0000000000000286 RDI: b0746d4e6bee35e2
[24249.132730] RBP: ffff8d56c016d500 R08: 0000000000000400 R09: ffff8d56ede0e67a
[24249.139958] R10: 0000000000000001 R11: ffff8d56c59d88c0 R12: 0000000000000010
[24249.147187] R13: 0000000000000820 R14: ffff8d5a2fa2a810 R15: ffff8d5a2fa2a818
[24249.154415] FS: 0000000000000000(0000) GS:ffff8d5a2fa00000(0000)
knlGS:0000000000000000
[24249.162620] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[24249.168471] CR2: 00007f0f3f7f8760 CR3: 0000000102466000 CR4: 00000000003526f0
[24249.175717] Call Trace:
[24249.178268] <TASK>
[24249.180476] ? die_addr (arch/x86/kernel/dumpstack.c:421
arch/x86/kernel/dumpstack.c:460)
[24249.183907] ? exc_general_protection (arch/x86/kernel/traps.c:783
arch/x86/kernel/traps.c:728)
[24249.188726] ? asm_exc_general_protection
(./arch/x86/include/asm/idtentry.h:564)
[24249.193720] ? kmem_cache_alloc_bulk (mm/slub.c:377 mm/slub.c:388
mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
[24249.198361] ? netif_receive_skb_list_internal (net/core/dev.c:5729)
[24249.203960] napi_skb_cache_get (net/core/skbuff.c:338)
[24249.208078] __napi_build_skb (net/core/skbuff.c:517)
[24249.211934] napi_build_skb (net/core/skbuff.c:541)
[24249.215616] ixgbe_poll
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2165
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2361
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3178)
[24249.219305] __napi_poll (net/core/dev.c:6498)
[24249.222905] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[24249.227197] ? __napi_poll (net/core/dev.c:6625)
[24249.231050] kthread (kernel/kthread.c:379)
[24249.234300] ? kthread_complete_and_exit (kernel/kthread.c:332)
[24249.239207] ret_from_fork (arch/x86/entry/entry_64.S:314)
[24249.242892] </TASK>
[24249.245185] Modules linked in: chaoskey
[24249.249133] ---[ end trace 0000000000000000 ]---
[24249.270157] pstore: backend (erst) writing error (-28)
[24249.275408] RIP: 0010:kmem_cache_alloc_bulk (mm/slub.c:377
mm/slub.c:388 mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
[24249.280660] Code: 0f 84 46 ff ff ff 65 ff 05 a4 bd e4 47 48 8b 4d
00 65 48 03 0d e8 5f e3 47 9c 5e fa 45 31 d2 eb 2f 8b 45 28 48 01 d0
48 89 c7 <48> 8b 00 48 33 85 b8 00 00 00 48 0f cf 48 31 f8 48 89 01 49
89 17
All code
========
0: 0f 84 46 ff ff ff je 0xffffffffffffff4c
6: 65 ff 05 a4 bd e4 47 incl %gs:0x47e4bda4(%rip) # 0x47e4bdb1
d: 48 8b 4d 00 mov 0x0(%rbp),%rcx
11: 65 48 03 0d e8 5f e3 add %gs:0x47e35fe8(%rip),%rcx # 0x47e36001
18: 47
19: 9c pushf
1a: 5e pop %rsi
1b: fa cli
1c: 45 31 d2 xor %r10d,%r10d
1f: eb 2f jmp 0x50
21: 8b 45 28 mov 0x28(%rbp),%eax
24: 48 01 d0 add %rdx,%rax
27: 48 89 c7 mov %rax,%rdi
2a:* 48 8b 00 mov (%rax),%rax <-- trapping instruction
2d: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
34: 48 0f cf bswap %rdi
37: 48 31 f8 xor %rdi,%rax
3a: 48 89 01 mov %rax,(%rcx)
3d: 49 89 17 mov %rdx,(%r15)

Code starting with the faulting instruction
===========================================
0: 48 8b 00 mov (%rax),%rax
3: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
a: 48 0f cf bswap %rdi
d: 48 31 f8 xor %rdi,%rax
10: 48 89 01 mov %rax,(%rcx)
13: 49 89 17 mov %rdx,(%r15)
[24249.299578] RSP: 0018:ffff9fc303973d20 EFLAGS: 00010086
[24249.304917] RAX: b0746d4e6bee35e2 RBX: 0000000000000001 RCX: ffff8d5a2fa31da0
[24249.312161] RDX: b0746d4e6bee3572 RSI: 0000000000000286 RDI: b0746d4e6bee35e2
[24249.319407] RBP: ffff8d56c016d500 R08: 0000000000000400 R09: ffff8d56ede0e67a
[24249.326651] R10: 0000000000000001 R11: ffff8d56c59d88c0 R12: 0000000000000010
[24249.333896] R13: 0000000000000820 R14: ffff8d5a2fa2a810 R15: ffff8d5a2fa2a818
[24249.341141] FS: 0000000000000000(0000) GS:ffff8d5a2fa00000(0000)
knlGS:0000000000000000
[24249.349356] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[24249.355206] CR2: 00007f0f3f7f8760 CR3: 0000000102466000 CR4: 00000000003526f0
[24249.362452] Kernel panic - not syncing: Fatal exception in interrupt
[24249.566854] Kernel Offset: 0x36e00000 from 0xffffffff81000000
(relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[24249.594124] ---[ end Kernel panic - not syncing: Fatal exception in
interrupt ]---

It's also odd that I get an OOM - it only seems to happen when I enable
rx-gro-list - it's also odd because this machine always has ~8GB of
memory available

On Tue, Jun 27, 2023 at 2:31 PM Ian Kumlien <[email protected]> wrote:
>
> On Tue, Jun 27, 2023 at 11:19 AM Paolo Abeni <[email protected]> wrote:
> >
> > On Mon, 2023-06-26 at 20:59 +0200, Ian Kumlien wrote:
> > > On Mon, Jun 26, 2023 at 8:20 PM Ian Kumlien <[email protected]> wrote:
> > > >
> > > > Nevermind, I think I found it, I will loop this thing until I have a
> > > > proper trace....
> > >
> > > Still some question marks, but much better
> >
> > Thanks!
> > >
> > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > [ 62.624003] BUG: kernel NULL pointer dereference, address: 00000000000000c0
> > > [ 62.631083] #PF: supervisor read access in kernel mode
> > > [ 62.636312] #PF: error_code(0x0000) - not-present page
> > > [ 62.641541] PGD 0 P4D 0
> > > [ 62.644174] Oops: 0000 [#1] PREEMPT SMP NOPTI
> > > [ 62.648629] CPU: 1 PID: 913 Comm: napi/eno2-79 Not tainted 6.4.0 #364
> > > [ 62.655162] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> > > BIOS 1.7a 10/13/2022
> > > [ 62.663344] RIP: 0010:__udp_gso_segment
> > > (./include/linux/skbuff.h:2858 ./include/linux/udp.h:23
> > > net/ipv4/udp_offload.c:228 net/ipv4/udp_offload.c:261
> > > net/ipv4/udp_offload.c:277)
> >
> > So it's faulting here:
> >
> > static struct sk_buff *__udpv4_gso_segment_list_csum(struct sk_buff *segs)
> > {
> > struct sk_buff *seg;
> > struct udphdr *uh, *uh2;
> > struct iphdr *iph, *iph2;
> >
> > seg = segs;
> > uh = udp_hdr(seg);
> > iph = ip_hdr(seg);
> >
> > if ((udp_hdr(seg)->dest == udp_hdr(seg->next)->dest) &&
> > // ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> >
> > The GSO segment has been assembled by skb_gro_receive_list()
> > I guess seg->next is NULL, which is somewhat unexpected as
> > napi_gro_complete() clears the gso_size when sending up the stack a
> > single frame.
> >
> > On the flip side, AFAICS, nothing prevents the stack from changing the
> > aggregated packet layout (e.g. pulling data and/or linearizing the
> > skb).
> >
> > In any case this looks more related to rx-gro-list then rx-udp-gro-
> > forwarding. I understand you have both feature enabled in your env?
> >
> > Side questions: do you have any non trivial nf/br filter rule?
> >
> > The following could possibly validate the above and avoid the issue,
> > but it's a bit papering over it. Could you please try it in your env?
>
> Will do as soon as i get home =)
>
> > Thanks!
> >
> > Paolo
> > ---
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 6c5915efbc17..75531686bfdf 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -4319,6 +4319,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> >
> > skb->prev = tail;
> >
> > + if (WARN_ON_ONCE(!skb->next))
> > + goto err_linearize;
> > +
> > if (skb_needs_linearize(skb, features) &&
> > __skb_linearize(skb))
> > goto err_linearize;
> >

2023-06-28 09:48:55

by Paolo Abeni

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

Hello,

On Wed, 2023-06-28 at 09:37 +0200, Ian Kumlien wrote:
> Been running all night but eventually it crashed again...
>
> [21753.055795] Out of memory: Killed process 970 (qemu-system-x86)
> total-vm:4709488kB, anon-rss:2172652kB, file-rss:4608kB,
> shmem-rss:0kB, UID:77 pgtables:4800kB oom_score_adj:0
> [24249.061154] general protection fault, probably for non-canonical
> address 0xb0746d4e6bee35e2: 0000 [#1] PREEMPT SMP NOPTI
> [24249.072138] CPU: 0 PID: 893 Comm: napi/eno1-68 Tainted: G W
> 6.4.0-dirty #366
> [24249.080670] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> BIOS 1.7a 10/13/2022
> [24249.088852] RIP: 0010:kmem_cache_alloc_bulk (mm/slub.c:377
> mm/slub.c:388 mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
> [24249.094086] Code: 0f 84 46 ff ff ff 65 ff 05 a4 bd e4 47 48 8b 4d
> 00 65 48 03 0d e8 5f e3 47 9c 5e fa 45 31 d2 eb 2f 8b 45 28 48 01 d0
> 48 89 c7 <48> 8b 00 48 33 85 b8 00 00 00 48 0f cf 48 31 f8 48 89 01 49
> 89 17
> All code
> ========
> 0: 0f 84 46 ff ff ff je 0xffffffffffffff4c
> 6: 65 ff 05 a4 bd e4 47 incl %gs:0x47e4bda4(%rip) # 0x47e4bdb1
> d: 48 8b 4d 00 mov 0x0(%rbp),%rcx
> 11: 65 48 03 0d e8 5f e3 add %gs:0x47e35fe8(%rip),%rcx # 0x47e36001
> 18: 47
> 19: 9c pushf
> 1a: 5e pop %rsi
> 1b: fa cli
> 1c: 45 31 d2 xor %r10d,%r10d
> 1f: eb 2f jmp 0x50
> 21: 8b 45 28 mov 0x28(%rbp),%eax
> 24: 48 01 d0 add %rdx,%rax
> 27: 48 89 c7 mov %rax,%rdi
> 2a:* 48 8b 00 mov (%rax),%rax <-- trapping instruction
> 2d: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> 34: 48 0f cf bswap %rdi
> 37: 48 31 f8 xor %rdi,%rax
> 3a: 48 89 01 mov %rax,(%rcx)
> 3d: 49 89 17 mov %rdx,(%r15)
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 8b 00 mov (%rax),%rax
> 3: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> a: 48 0f cf bswap %rdi
> d: 48 31 f8 xor %rdi,%rax
> 10: 48 89 01 mov %rax,(%rcx)
> 13: 49 89 17 mov %rdx,(%r15)
> [24249.112951] RSP: 0018:ffff9fc303973d20 EFLAGS: 00010086
> [24249.118275] RAX: b0746d4e6bee35e2 RBX: 0000000000000001 RCX: ffff8d5a2fa31da0
> [24249.125501] RDX: b0746d4e6bee3572 RSI: 0000000000000286 RDI: b0746d4e6bee35e2
> [24249.132730] RBP: ffff8d56c016d500 R08: 0000000000000400 R09: ffff8d56ede0e67a
> [24249.139958] R10: 0000000000000001 R11: ffff8d56c59d88c0 R12: 0000000000000010
> [24249.147187] R13: 0000000000000820 R14: ffff8d5a2fa2a810 R15: ffff8d5a2fa2a818
> [24249.154415] FS: 0000000000000000(0000) GS:ffff8d5a2fa00000(0000)
> knlGS:0000000000000000
> [24249.162620] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [24249.168471] CR2: 00007f0f3f7f8760 CR3: 0000000102466000 CR4: 00000000003526f0
> [24249.175717] Call Trace:
> [24249.178268] <TASK>
> [24249.180476] ? die_addr (arch/x86/kernel/dumpstack.c:421
> arch/x86/kernel/dumpstack.c:460)
> [24249.183907] ? exc_general_protection (arch/x86/kernel/traps.c:783
> arch/x86/kernel/traps.c:728)
> [24249.188726] ? asm_exc_general_protection
> (./arch/x86/include/asm/idtentry.h:564)
> [24249.193720] ? kmem_cache_alloc_bulk (mm/slub.c:377 mm/slub.c:388
> mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
> [24249.198361] ? netif_receive_skb_list_internal (net/core/dev.c:5729)
> [24249.203960] napi_skb_cache_get (net/core/skbuff.c:338)
> [24249.208078] __napi_build_skb (net/core/skbuff.c:517)
> [24249.211934] napi_build_skb (net/core/skbuff.c:541)
> [24249.215616] ixgbe_poll
> (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2165
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2361
> drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3178)
> [24249.219305] __napi_poll (net/core/dev.c:6498)
> [24249.222905] napi_threaded_poll (./include/linux/netpoll.h:89
> net/core/dev.c:6640)
> [24249.227197] ? __napi_poll (net/core/dev.c:6625)
> [24249.231050] kthread (kernel/kthread.c:379)
> [24249.234300] ? kthread_complete_and_exit (kernel/kthread.c:332)
> [24249.239207] ret_from_fork (arch/x86/entry/entry_64.S:314)
> [24249.242892] </TASK>
> [24249.245185] Modules linked in: chaoskey
> [24249.249133] ---[ end trace 0000000000000000 ]---
> [24249.270157] pstore: backend (erst) writing error (-28)
> [24249.275408] RIP: 0010:kmem_cache_alloc_bulk (mm/slub.c:377
> mm/slub.c:388 mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
> [24249.280660] Code: 0f 84 46 ff ff ff 65 ff 05 a4 bd e4 47 48 8b 4d
> 00 65 48 03 0d e8 5f e3 47 9c 5e fa 45 31 d2 eb 2f 8b 45 28 48 01 d0
> 48 89 c7 <48> 8b 00 48 33 85 b8 00 00 00 48 0f cf 48 31 f8 48 89 01 49
> 89 17
> All code
> ========
> 0: 0f 84 46 ff ff ff je 0xffffffffffffff4c
> 6: 65 ff 05 a4 bd e4 47 incl %gs:0x47e4bda4(%rip) # 0x47e4bdb1
> d: 48 8b 4d 00 mov 0x0(%rbp),%rcx
> 11: 65 48 03 0d e8 5f e3 add %gs:0x47e35fe8(%rip),%rcx # 0x47e36001
> 18: 47
> 19: 9c pushf
> 1a: 5e pop %rsi
> 1b: fa cli
> 1c: 45 31 d2 xor %r10d,%r10d
> 1f: eb 2f jmp 0x50
> 21: 8b 45 28 mov 0x28(%rbp),%eax
> 24: 48 01 d0 add %rdx,%rax
> 27: 48 89 c7 mov %rax,%rdi
> 2a:* 48 8b 00 mov (%rax),%rax <-- trapping instruction
> 2d: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> 34: 48 0f cf bswap %rdi
> 37: 48 31 f8 xor %rdi,%rax
> 3a: 48 89 01 mov %rax,(%rcx)
> 3d: 49 89 17 mov %rdx,(%r15)
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 8b 00 mov (%rax),%rax
> 3: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> a: 48 0f cf bswap %rdi
> d: 48 31 f8 xor %rdi,%rax
> 10: 48 89 01 mov %rax,(%rcx)
> 13: 49 89 17 mov %rdx,(%r15)
> [24249.299578] RSP: 0018:ffff9fc303973d20 EFLAGS: 00010086
> [24249.304917] RAX: b0746d4e6bee35e2 RBX: 0000000000000001 RCX: ffff8d5a2fa31da0
> [24249.312161] RDX: b0746d4e6bee3572 RSI: 0000000000000286 RDI: b0746d4e6bee35e2
> [24249.319407] RBP: ffff8d56c016d500 R08: 0000000000000400 R09: ffff8d56ede0e67a
> [24249.326651] R10: 0000000000000001 R11: ffff8d56c59d88c0 R12: 0000000000000010
> [24249.333896] R13: 0000000000000820 R14: ffff8d5a2fa2a810 R15: ffff8d5a2fa2a818
> [24249.341141] FS: 0000000000000000(0000) GS:ffff8d5a2fa00000(0000)
> knlGS:0000000000000000
> [24249.349356] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [24249.355206] CR2: 00007f0f3f7f8760 CR3: 0000000102466000 CR4: 00000000003526f0
> [24249.362452] Kernel panic - not syncing: Fatal exception in interrupt
> [24249.566854] Kernel Offset: 0x36e00000 from 0xffffffff81000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [24249.594124] ---[ end Kernel panic - not syncing: Fatal exception in
> interrupt ]---
>
> It's also odd that i get a OOM - it only seems to happen when i enable
> rx-gro-list 

Unfortunately, not the result I was looking for. That leads to more
questions than answers, I'm sorry.

How long did the host keep going with rx-gro-list enabled?

Did you observe the WARN_ON() introduced by the tentative fix?

> - it's also odd because this machine always has ~8GB of
> memory available

It looks like there is a memory leak somewhere, and I don't think the
tentative fixup introduced such an issue.

It looks like the above splat is due to a slab corruption, which in
turn could be unrelated to the mentioned leak, but it could/should
be related to rx-gro-list.
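
As an aside on the splat above: the faulting address 0xb0746d4e6bee35e2
equals RDX + 0x70, and the trapping instruction loads through it, which
looks like the allocator following a garbage freelist pointer. The sketch
below, assuming 48-bit virtual addresses (4-level paging), shows why the
CPU calls that address non-canonical. is_canonical_48() is a made-up
helper for illustration, not a kernel function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumption: 48-bit virtual addresses. An address is canonical when
 * bits 63..47 are all zeros or all ones.
 */
static bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the 17 most significant bits */

	return top == 0 || top == 0x1ffff;
}

/* Self-check using values copied from the register dump above. */
void check_canonical_examples(void)
{
	/* RAX, the pointer being dereferenced: rejected by the CPU. */
	assert(!is_canonical_48(0xb0746d4e6bee35e2ULL));
	/* RBP, an ordinary kernel pointer. */
	assert(is_canonical_48(0xffff8d56c016d500ULL));
	/* CR2, a user-space address. */
	assert(is_canonical_48(0x00007f0f3f7f8760ULL));
	/* RAX is RDX plus 0x70. */
	assert(0xb0746d4e6bee3572ULL + 0x70 == 0xb0746d4e6bee35e2ULL);
}
```

A pointer like that raises a general protection fault rather than a page
fault, which matches the header of the splat.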

Could you please run the test with both kmemleak and kasan enabled?

Additionally could you please disclose if you have non-trivial
netfilter and/or bridge filter and/or tc rules possibly modifying the
incoming/egress packets?

If kasan is not an option, could you please apply the debug patch
below? (on top of the previous one)

Thanks!

Paolo
---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6c5915efbc17..94adca27b205 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4295,6 +4295,8 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
delta_len += nskb->len;

skb_push(nskb, -skb_network_offset(nskb) + offset);
+ if (WARN_ON_ONCE(nskb->data - skb->head > skb->tail))
+ goto err_linearize;

skb_release_head_state(nskb);
len_diff = skb_network_header_len(nskb) - skb_network_header_len(skb);
@@ -4302,6 +4304,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

skb_headers_offset_update(nskb, skb_headroom(nskb) - skb_headroom(skb));
nskb->transport_header += len_diff;
+ if (WARN_ON_ONCE(tnl_hlen > skb_headroom(nskb)))
+ goto err_linearize;
+ if (WARN_ON_ONCE(skb_headroom(nskb) + offset > nskb->tail))
+ goto err_linearize;
+
skb_copy_from_linear_data_offset(skb, -tnl_hlen,
nskb->data - tnl_hlen,
offset + tnl_hlen);



2023-06-28 12:23:16

by Ian Kumlien

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Wed, Jun 28, 2023 at 11:06 AM Paolo Abeni <[email protected]> wrote:
>
> Hello,
>
> On Wed, 2023-06-28 at 09:37 +0200, Ian Kumlien wrote:
> > Been running all night but eventually it crashed again...
> >
> > [21753.055795] Out of memory: Killed process 970 (qemu-system-x86)
> > total-vm:4709488kB, anon-rss:2172652kB, file-rss:4608kB,
> > shmem-rss:0kB, UID:77 pgtables:4800kB oom_score_adj:0
> > [24249.061154] general protection fault, probably for non-canonical
> > address 0xb0746d4e6bee35e2: 0000 [#1] PREEMPT SMP NOPTI
> > [24249.072138] CPU: 0 PID: 893 Comm: napi/eno1-68 Tainted: G W
> > 6.4.0-dirty #366
> > [24249.080670] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> > BIOS 1.7a 10/13/2022
> > [24249.088852] RIP: 0010:kmem_cache_alloc_bulk (mm/slub.c:377
> > mm/slub.c:388 mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
> > [24249.094086] Code: 0f 84 46 ff ff ff 65 ff 05 a4 bd e4 47 48 8b 4d
> > 00 65 48 03 0d e8 5f e3 47 9c 5e fa 45 31 d2 eb 2f 8b 45 28 48 01 d0
> > 48 89 c7 <48> 8b 00 48 33 85 b8 00 00 00 48 0f cf 48 31 f8 48 89 01 49
> > 89 17
> > All code
> > ========
> > 0: 0f 84 46 ff ff ff je 0xffffffffffffff4c
> > 6: 65 ff 05 a4 bd e4 47 incl %gs:0x47e4bda4(%rip) # 0x47e4bdb1
> > d: 48 8b 4d 00 mov 0x0(%rbp),%rcx
> > 11: 65 48 03 0d e8 5f e3 add %gs:0x47e35fe8(%rip),%rcx # 0x47e36001
> > 18: 47
> > 19: 9c pushf
> > 1a: 5e pop %rsi
> > 1b: fa cli
> > 1c: 45 31 d2 xor %r10d,%r10d
> > 1f: eb 2f jmp 0x50
> > 21: 8b 45 28 mov 0x28(%rbp),%eax
> > 24: 48 01 d0 add %rdx,%rax
> > 27: 48 89 c7 mov %rax,%rdi
> > 2a:* 48 8b 00 mov (%rax),%rax <-- trapping instruction
> > 2d: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> > 34: 48 0f cf bswap %rdi
> > 37: 48 31 f8 xor %rdi,%rax
> > 3a: 48 89 01 mov %rax,(%rcx)
> > 3d: 49 89 17 mov %rdx,(%r15)
> >
> > Code starting with the faulting instruction
> > ===========================================
> > 0: 48 8b 00 mov (%rax),%rax
> > 3: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> > a: 48 0f cf bswap %rdi
> > d: 48 31 f8 xor %rdi,%rax
> > 10: 48 89 01 mov %rax,(%rcx)
> > 13: 49 89 17 mov %rdx,(%r15)
> > [24249.112951] RSP: 0018:ffff9fc303973d20 EFLAGS: 00010086
> > [24249.118275] RAX: b0746d4e6bee35e2 RBX: 0000000000000001 RCX: ffff8d5a2fa31da0
> > [24249.125501] RDX: b0746d4e6bee3572 RSI: 0000000000000286 RDI: b0746d4e6bee35e2
> > [24249.132730] RBP: ffff8d56c016d500 R08: 0000000000000400 R09: ffff8d56ede0e67a
> > [24249.139958] R10: 0000000000000001 R11: ffff8d56c59d88c0 R12: 0000000000000010
> > [24249.147187] R13: 0000000000000820 R14: ffff8d5a2fa2a810 R15: ffff8d5a2fa2a818
> > [24249.154415] FS: 0000000000000000(0000) GS:ffff8d5a2fa00000(0000)
> > knlGS:0000000000000000
> > [24249.162620] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [24249.168471] CR2: 00007f0f3f7f8760 CR3: 0000000102466000 CR4: 00000000003526f0
> > [24249.175717] Call Trace:
> > [24249.178268] <TASK>
> > [24249.180476] ? die_addr (arch/x86/kernel/dumpstack.c:421
> > arch/x86/kernel/dumpstack.c:460)
> > [24249.183907] ? exc_general_protection (arch/x86/kernel/traps.c:783
> > arch/x86/kernel/traps.c:728)
> > [24249.188726] ? asm_exc_general_protection
> > (./arch/x86/include/asm/idtentry.h:564)
> > [24249.193720] ? kmem_cache_alloc_bulk (mm/slub.c:377 mm/slub.c:388
> > mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
> > [24249.198361] ? netif_receive_skb_list_internal (net/core/dev.c:5729)
> > [24249.203960] napi_skb_cache_get (net/core/skbuff.c:338)
> > [24249.208078] __napi_build_skb (net/core/skbuff.c:517)
> > [24249.211934] napi_build_skb (net/core/skbuff.c:541)
> > [24249.215616] ixgbe_poll
> > (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2165
> > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2361
> > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3178)
> > [24249.219305] __napi_poll (net/core/dev.c:6498)
> > [24249.222905] napi_threaded_poll (./include/linux/netpoll.h:89
> > net/core/dev.c:6640)
> > [24249.227197] ? __napi_poll (net/core/dev.c:6625)
> > [24249.231050] kthread (kernel/kthread.c:379)
> > [24249.234300] ? kthread_complete_and_exit (kernel/kthread.c:332)
> > [24249.239207] ret_from_fork (arch/x86/entry/entry_64.S:314)
> > [24249.242892] </TASK>
> > [24249.245185] Modules linked in: chaoskey
> > [24249.249133] ---[ end trace 0000000000000000 ]---
> > [24249.270157] pstore: backend (erst) writing error (-28)
> > [24249.275408] RIP: 0010:kmem_cache_alloc_bulk (mm/slub.c:377
> > mm/slub.c:388 mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
> > [24249.280660] Code: 0f 84 46 ff ff ff 65 ff 05 a4 bd e4 47 48 8b 4d
> > 00 65 48 03 0d e8 5f e3 47 9c 5e fa 45 31 d2 eb 2f 8b 45 28 48 01 d0
> > 48 89 c7 <48> 8b 00 48 33 85 b8 00 00 00 48 0f cf 48 31 f8 48 89 01 49
> > 89 17
> > All code
> > ========
> > 0: 0f 84 46 ff ff ff je 0xffffffffffffff4c
> > 6: 65 ff 05 a4 bd e4 47 incl %gs:0x47e4bda4(%rip) # 0x47e4bdb1
> > d: 48 8b 4d 00 mov 0x0(%rbp),%rcx
> > 11: 65 48 03 0d e8 5f e3 add %gs:0x47e35fe8(%rip),%rcx # 0x47e36001
> > 18: 47
> > 19: 9c pushf
> > 1a: 5e pop %rsi
> > 1b: fa cli
> > 1c: 45 31 d2 xor %r10d,%r10d
> > 1f: eb 2f jmp 0x50
> > 21: 8b 45 28 mov 0x28(%rbp),%eax
> > 24: 48 01 d0 add %rdx,%rax
> > 27: 48 89 c7 mov %rax,%rdi
> > 2a:* 48 8b 00 mov (%rax),%rax <-- trapping instruction
> > 2d: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> > 34: 48 0f cf bswap %rdi
> > 37: 48 31 f8 xor %rdi,%rax
> > 3a: 48 89 01 mov %rax,(%rcx)
> > 3d: 49 89 17 mov %rdx,(%r15)
> >
> > Code starting with the faulting instruction
> > ===========================================
> > 0: 48 8b 00 mov (%rax),%rax
> > 3: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> > a: 48 0f cf bswap %rdi
> > d: 48 31 f8 xor %rdi,%rax
> > 10: 48 89 01 mov %rax,(%rcx)
> > 13: 49 89 17 mov %rdx,(%r15)
> > [24249.299578] RSP: 0018:ffff9fc303973d20 EFLAGS: 00010086
> > [24249.304917] RAX: b0746d4e6bee35e2 RBX: 0000000000000001 RCX: ffff8d5a2fa31da0
> > [24249.312161] RDX: b0746d4e6bee3572 RSI: 0000000000000286 RDI: b0746d4e6bee35e2
> > [24249.319407] RBP: ffff8d56c016d500 R08: 0000000000000400 R09: ffff8d56ede0e67a
> > [24249.326651] R10: 0000000000000001 R11: ffff8d56c59d88c0 R12: 0000000000000010
> > [24249.333896] R13: 0000000000000820 R14: ffff8d5a2fa2a810 R15: ffff8d5a2fa2a818
> > [24249.341141] FS: 0000000000000000(0000) GS:ffff8d5a2fa00000(0000)
> > knlGS:0000000000000000
> > [24249.349356] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [24249.355206] CR2: 00007f0f3f7f8760 CR3: 0000000102466000 CR4: 00000000003526f0
> > [24249.362452] Kernel panic - not syncing: Fatal exception in interrupt
> > [24249.566854] Kernel Offset: 0x36e00000 from 0xffffffff81000000
> > (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > [24249.594124] ---[ end Kernel panic - not syncing: Fatal exception in
> > interrupt ]---
> >
> > It's also odd that I get an OOM - it only seems to happen when I enable
> > rx-gro-list
>
> Unfortunately, not the result I was looking for. That leads to more
> questions than answers, I'm sorry.

I understand you...

> How long did the host keep going with rx-gro-list enabled?

Well, hours...

reboot system boot 6.4.0-dirty Wed Jun 28 04:20 - 13:39 (09:19)
reboot system boot 6.4.0-dirty Tue Jun 27 21:31 - 13:39 (16:08)

So, let's imagine a few seconds to log in and enable everything

> Did you observe the WARN_ON() introduced by the tentative fix?

I could only see the console, so saw nothing...

> > - it's also odd because this machine always has ~8GB of
> > memory available
>
> It looks like there is a memory leak somewhere, and I don't think the
> tentative fixup introduced such issue.

I agree, it was there before...

> It looks like the above splat is due to a slab corruption, which in
> turn could be unrelated to the mentioned leak, but it could/should
> be related to rx-gro-list.

Agreed =)

> Could you please run the test with both kmemleak and kasan enabled?

Machine-slowdown-enabled^tm

> Additionally could you please disclose if you have non trivial
> netfilter and/or bridge filter and/or tc rules possibly modifying the
> incoming/egress packets?

I only have basic reject/accept rules and some snat/dnat pairs, but I
don't see that counting as "non trivial" ;)

> If kasan is not an option, could you please apply the debug patch
> below? (on top of the previous one)

I actually did both, if it's unrelated we should know as well..

I hope i have something for you before tomorrow, else there will be a
bit of a break until next week

> Thanks!
>
> Paolo
> ---
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 6c5915efbc17..94adca27b205 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4295,6 +4295,8 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> delta_len += nskb->len;
>
> skb_push(nskb, -skb_network_offset(nskb) + offset);
> + if (WARN_ON_ONCE(nskb->data - skb->head > skb->tail))
> + goto err_linearize;
>
> skb_release_head_state(nskb);
> len_diff = skb_network_header_len(nskb) - skb_network_header_len(skb);
> @@ -4302,6 +4304,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>
> skb_headers_offset_update(nskb, skb_headroom(nskb) - skb_headroom(skb));
> nskb->transport_header += len_diff;
> + if (WARN_ON_ONCE(tnl_hlen > skb_headroom(nskb)))
> + goto err_linearize;
> + if (WARN_ON_ONCE(skb_headroom(nskb) + offset > nskb->tail))
> + goto err_linearize;
> +
> skb_copy_from_linear_data_offset(skb, -tnl_hlen,
> nskb->data - tnl_hlen,
> offset + tnl_hlen);
>
>
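
For readers unfamiliar with the "non-canonical address" wording in the general protection fault quoted above, a minimal sketch of the check follows. It assumes 4-level paging, i.e. 48-bit virtual addresses, where bits 63..47 must all be equal; the function name `is_canonical_48()` is hypothetical and not a kernel API.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * With 48-bit virtual addresses, an x86-64 address is canonical only
 * if bits 63..47 are all zeros or all ones (sign extension of bit 47).
 */
static bool is_canonical_48(uint64_t addr)
{
    uint64_t upper = addr >> 47; /* the 17 bits that must agree */

    return upper == 0 || upper == 0x1ffff;
}
```

Under that assumption, the faulting value in RAX (0xb0746d4e6bee35e2) fails the check, while an ordinary kernel pointer from the same register dump, such as RCX (0xffff8d5a2fa31da0), passes it.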

2023-06-28 12:37:13

by Ian Kumlien

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

So I have some hits; would it be better without your WARN_ON? ... Things
are a bit slow atm - let's just say that I noticed the stack traces
because a stream stuttered =)

cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
[ 100.136018] ------------[ cut here ]------------
[ 100.136044] WARNING: CPU: 2 PID: 911 at net/core/skbuff.c:4307
skb_segment_list (net/core/skbuff.c:4307)
[ 100.136085] Modules linked in: chaoskey
[ 100.136113] CPU: 2 PID: 911 Comm: napi/eno1-67 Not tainted 6.4.0-dirty #367
[ 100.136135] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[ 100.136148] RIP: 0010:skb_segment_list (net/core/skbuff.c:4307)
[ 100.136169] Code: e9 21 fe ff ff 48 8b ac 24 a0 00 00 00 89 3c 24 e8
8e 5b c9 fd 8b 34 24 48 c7 c1 00 bc 3e 99 4c 89 ef 48 89 ea e8 19 97
fd ff <0f> 0b 48 b8 00 00 00 00 00 fc ff df 4c 89 fa 48 c1 ea 03 80 3c
02
All code
========
0: e9 21 fe ff ff jmp 0xfffffffffffffe26
5: 48 8b ac 24 a0 00 00 mov 0xa0(%rsp),%rbp
c: 00
d: 89 3c 24 mov %edi,(%rsp)
10: e8 8e 5b c9 fd call 0xfffffffffdc95ba3
15: 8b 34 24 mov (%rsp),%esi
18: 48 c7 c1 00 bc 3e 99 mov $0xffffffff993ebc00,%rcx
1f: 4c 89 ef mov %r13,%rdi
22: 48 89 ea mov %rbp,%rdx
25: e8 19 97 fd ff call 0xfffffffffffd9743
2a:* 0f 0b ud2 <-- trapping instruction
2c: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
33: fc ff df
36: 4c 89 fa mov %r15,%rdx
39: 48 c1 ea 03 shr $0x3,%rdx
3d: 80 .byte 0x80
3e: 3c 02 cmp $0x2,%al

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
9: fc ff df
c: 4c 89 fa mov %r15,%rdx
f: 48 c1 ea 03 shr $0x3,%rdx
13: 80 .byte 0x80
14: 3c 02 cmp $0x2,%al
[ 100.136188] RSP: 0018:ffff88811eea6fb0 EFLAGS: 00010212
[ 100.136208] RAX: 00000000000005cc RBX: ffff88814b0da000 RCX: ffffffff97d7acb7
[ 100.136222] RDX: ffff888221044474 RSI: 1ffff11044208891 RDI: 000000000000002a
[ 100.136236] RBP: 00000000000020c0 R08: 0000000000000000 R09: ffff888221044497
[ 100.136248] R10: ffffed1044208892 R11: 0000000000000014 R12: ffff888221044480
[ 100.136261] R13: ffff8882210443c0 R14: dffffc0000000000 R15: ffff88811a6472c0
[ 100.136275] FS: 0000000000000000(0000) GS:ffff88842f300000(0000)
knlGS:0000000000000000
[ 100.136289] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 100.136303] CR2: 0000000000000000 CR3: 0000000120900000 CR4: 00000000003526e0
[ 100.136315] Call Trace:
[ 100.136327] <TASK>
[ 100.136339] ? __warn (kernel/panic.c:673)
[ 100.136361] ? skb_segment_list (net/core/skbuff.c:4307)
[ 100.136379] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 100.136400] ? handle_bug (arch/x86/kernel/traps.c:324)
[ 100.136419] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1))
[ 100.136439] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[ 100.136462] ? skb_segment_list (./arch/x86/include/asm/atomic.h:29
./include/linux/atomic/atomic-instrumented.h:28
./include/linux/refcount.h:147 ./include/linux/skbuff.h:1986
net/core/skbuff.c:4281)
[ 100.136482] ? skb_segment_list (net/core/skbuff.c:4307)
[ 100.136503] __udp_gso_segment (net/ipv4/udp_offload.c:255
net/ipv4/udp_offload.c:277)
[ 100.136525] ? nft_masq_init (net/netfilter/nft_masq.c:102)
[ 100.136542] ? ixgbe_xdp_xmit
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:8718)
[ 100.136563] inet_gso_segment (net/ipv4/af_inet.c:1399)
[ 100.136582] ? skb_crc32c_csum_help (./include/linux/skbuff.h:2698
./include/linux/skbuff.h:2956 net/core/dev.c:3303)
[ 100.136604] skb_mac_gso_segment (net/core/gro.c:141)
[ 100.136624] ? skb_eth_gso_segment (net/core/gro.c:127)
[ 100.136645] __skb_gso_segment (net/core/dev.c:3403 (discriminator 2))
[ 100.136663] ? netif_skb_features (net/core/dev.c:3474 net/core/dev.c:3563)
[ 100.136683] validate_xmit_skb (./include/linux/netdevice.h:4862
net/core/dev.c:3659)
[ 100.136704] validate_xmit_skb_list (net/core/dev.c:3710)
[ 100.136725] sch_direct_xmit (net/sched/sch_generic.c:330)
[ 100.136745] ? qdisc_put_unlocked (net/sched/sch_generic.c:317)
[ 100.136762] ? _raw_spin_trylock (./arch/x86/include/asm/atomic.h:29
./include/linux/atomic/atomic-instrumented.h:28
./include/asm-generic/qspinlock.h:92 ./include/linux/spinlock.h:192
./include/linux/spinlock_api_smp.h:89 kernel/locking/spinlock.c:138)
[ 100.136783] ? _raw_spin_lock_irqsave (kernel/locking/spinlock.c:137)
[ 100.136835] __dev_queue_xmit (net/core/dev.c:3805 net/core/dev.c:4210)
[ 100.136862] ? ip_finish_output2 (net/ipv4/ip_output.c:196)
[ 100.136883] ? netdev_core_pick_tx (net/core/dev.c:4151)
[ 100.136907] ? ip_setup_cork (net/ipv4/ip_output.c:196)
[ 100.136927] ? __ip_finish_output (net/ipv4/ip_output.c:250
net/ipv4/ip_output.c:302 net/ipv4/ip_output.c:289)
[ 100.136945] ? eth_header (net/ethernet/eth.c:100)
[ 100.136966] ? neigh_resolve_output
(./include/linux/netdevice.h:3140 net/core/neighbour.c:1547
net/core/neighbour.c:1532)
[ 100.136988] neigh_xmit (net/core/neighbour.c:3156)
[ 100.137007] nf_flow_offload_ip_hook (net/netfilter/nf_flow_table_ip.c:418)
[ 100.137032] ? nf_flow_queue_xmit (net/netfilter/nf_flow_table_ip.c:342)
[ 100.137054] ? consume_skb (./arch/x86/include/asm/atomic.h:190
./include/linux/atomic/atomic-instrumented.h:177
./include/linux/refcount.h:272 ./include/linux/refcount.h:315
./include/linux/refcount.h:333 ./include/linux/skbuff.h:1221
net/core/skbuff.c:1240)
[ 100.137071] nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 100.137094] __netif_receive_skb_core.constprop.0
(./include/linux/netfilter_netdev.h:34 net/core/dev.c:5274
net/core/dev.c:5361)
[ 100.137120] ? do_xdp_generic (net/core/dev.c:5281)
[ 100.137142] ? __udp4_lib_lookup (net/ipv4/udp.c:531)
[ 100.137164] __netif_receive_skb_list_core (net/core/dev.c:5570)
[ 100.137188] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5546)
[ 100.137211] ? load_balance (kernel/sched/fair.c:10908)
[ 100.137230] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[ 100.137250] ? ktime_get_with_offset (kernel/time/timekeeping.c:292
(discriminator 3) kernel/time/timekeeping.c:388 (discriminator 3)
kernel/time/timekeeping.c:891 (discriminator 3))
[ 100.137272] netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[ 100.137295] ? process_backlog (net/core/dev.c:5699)
[ 100.137317] ? napi_gro_complete.constprop.0 (net/core/gro.c:321)
[ 100.137338] ? dev_gro_receive (./arch/x86/include/asm/bitops.h:94
(discriminator 8)
./include/asm-generic/bitops/instrumented-non-atomic.h:45
(discriminator 8) net/core/gro.c:583 (discriminator 8))
[ 100.137357] napi_complete_done (./include/linux/list.h:37
./include/net/gro.h:434 ./include/net/gro.h:429 net/core/dev.c:6067)
[ 100.137378] ? napi_busy_loop (net/core/dev.c:6034)
[ 100.137399] ixgbe_poll (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3191)
[ 100.137425] ? ixgbe_xdp_ring_update_tail_locked
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3141)
[ 100.137447] ? io_schedule_timeout (kernel/sched/core.c:6551)
[ 100.137469] __napi_poll (net/core/dev.c:6498)
[ 100.137490] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[ 100.137513] ? __napi_poll (net/core/dev.c:6625)
[ 100.137531] ? migrate_enable (kernel/sched/core.c:3045)
[ 100.137553] ? __kthread_parkme (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
kernel/kthread.c:271)
[ 100.137572] ? __napi_poll (net/core/dev.c:6625)
[ 100.137591] kthread (kernel/kthread.c:379)
[ 100.137610] ? kthread_complete_and_exit (kernel/kthread.c:336)
[ 100.137631] ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 100.137651] </TASK>
[ 100.137661] ---[ end trace 0000000000000000 ]---

[ 112.103156] ------------[ cut here ]------------
[ 112.103183] WARNING: CPU: 4 PID: 922 at net/core/skbuff.c:4337
skb_segment_list (net/core/skbuff.c:4337 (discriminator 1))
[ 112.103222] Modules linked in: chaoskey
[ 112.103251] CPU: 4 PID: 922 Comm: napi/eno2-80 Tainted: G W
6.4.0-dirty #367
[ 112.103273] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[ 112.103286] RIP: 0010:skb_segment_list (net/core/skbuff.c:4337
(discriminator 1))
[ 112.103308] Code: 41 0f c1 87 d4 00 00 00 85 c0 74 25 8d 50 01 09 c2
78 08 4c 89 f8 e9 28 fa ff ff be 01 00 00 00 48 89 df e8 63 70 a1 fe
eb e9 <0f> 0b e9 df f9 ff ff be 02 00 00 00 48 89 df e8 4d 70 a1 fe eb
d3
All code
========
0: 41 0f c1 87 d4 00 00 xadd %eax,0xd4(%r15)
7: 00
8: 85 c0 test %eax,%eax
a: 74 25 je 0x31
c: 8d 50 01 lea 0x1(%rax),%edx
f: 09 c2 or %eax,%edx
11: 78 08 js 0x1b
13: 4c 89 f8 mov %r15,%rax
16: e9 28 fa ff ff jmp 0xfffffffffffffa43
1b: be 01 00 00 00 mov $0x1,%esi
20: 48 89 df mov %rbx,%rdi
23: e8 63 70 a1 fe call 0xfffffffffea1708b
28: eb e9 jmp 0x13
2a:* 0f 0b ud2 <-- trapping instruction
2c: e9 df f9 ff ff jmp 0xfffffffffffffa10
31: be 02 00 00 00 mov $0x2,%esi
36: 48 89 df mov %rbx,%rdi
39: e8 4d 70 a1 fe call 0xfffffffffea1708b
3e: eb d3 jmp 0x13

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e9 df f9 ff ff jmp 0xfffffffffffff9e6
7: be 02 00 00 00 mov $0x2,%esi
c: 48 89 df mov %rbx,%rdi
f: e8 4d 70 a1 fe call 0xfffffffffea17061
14: eb d3 jmp 0xffffffffffffffe9
[ 112.103326] RSP: 0018:ffff88811c93ec38 EFLAGS: 00010246
[ 112.103346] RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff888218b92848
[ 112.103361] RDX: 1ffff110431724f0 RSI: ffff888218b92834 RDI: 0000000000000000
[ 112.103374] RBP: ffff8881804b6ec0 R08: ffff888218b92840 R09: 1ffff110431724fe
[ 112.103388] R10: ffff8881804b6000 R11: 0000000000000014 R12: 0000000000000000
[ 112.103400] R13: ffff8881804b6ec0 R14: 0000000000000022 R15: ffff888218b92780
[ 112.103414] FS: 0000000000000000(0000) GS:ffff88842f400000(0000)
knlGS:0000000000000000
[ 112.103429] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 112.103442] CR2: 00007fd1aa0419e5 CR3: 00000001287ea000 CR4: 00000000003526e0
[ 112.103456] Call Trace:
[ 112.103467] <TASK>
[ 112.103521] ? __warn (kernel/panic.c:673)
[ 112.103549] ? skb_segment_list (net/core/skbuff.c:4337 (discriminator 1))
[ 112.103569] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 112.103590] ? handle_bug (arch/x86/kernel/traps.c:324)
[ 112.103611] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1))
[ 112.103631] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[ 112.103658] ? skb_segment_list (net/core/skbuff.c:4337 (discriminator 1))
[ 112.103678] ? set_track_prepare (mm/slub.c:5682)
[ 112.103696] ? napi_complete_done (./include/linux/list.h:37
./include/net/gro.h:434 ./include/net/gro.h:429 net/core/dev.c:6067)
[ 112.103716] ? pcpu_alloc (mm/percpu-internal.h:129 mm/percpu.c:1880)
[ 112.103734] __udp_gso_segment (net/ipv4/udp_offload.c:255
net/ipv4/udp_offload.c:277)
[ 112.103758] ? _raw_spin_lock_irqsave
(./arch/x86/include/asm/atomic.h:202
./include/linux/atomic/atomic-instrumented.h:543
./include/asm-generic/qspinlock.h:111 ./include/linux/spinlock.h:186
./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
[ 112.103782] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161)
[ 112.103804] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[ 112.103826] ? netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[ 112.103848] inet_gso_segment (net/ipv4/af_inet.c:1399)
[ 112.103868] ? skb_crc32c_csum_help (./include/linux/skbuff.h:2698
./include/linux/skbuff.h:2956 net/core/dev.c:3303)
[ 112.103891] skb_mac_gso_segment (net/core/gro.c:141)
[ 112.103911] ? skb_eth_gso_segment (net/core/gro.c:127)
[ 112.103933] __skb_gso_segment (net/core/dev.c:3403 (discriminator 2))
[ 112.103952] ? netif_skb_features (net/core/dev.c:3474 net/core/dev.c:3563)
[ 112.103973] validate_xmit_skb (./include/linux/netdevice.h:4862
net/core/dev.c:3659)
[ 112.103993] ? kasan_save_stack (mm/kasan/common.c:47)
[ 112.104017] validate_xmit_skb_list (net/core/dev.c:3710)
[ 112.104039] sch_direct_xmit (net/sched/sch_generic.c:330)
[ 112.104058] ? ret_from_fork (arch/x86/entry/entry_64.S:308)
[ 112.104075] ? unwind_next_frame (arch/x86/kernel/unwind_orc.c:195
arch/x86/kernel/unwind_orc.c:469)
[ 112.104098] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 112.104115] ? qdisc_put_unlocked (net/sched/sch_generic.c:317)
[ 112.104133] ? _raw_spin_trylock (./arch/x86/include/asm/atomic.h:29
./include/linux/atomic/atomic-instrumented.h:28
./include/asm-generic/qspinlock.h:92 ./include/linux/spinlock.h:192
./include/linux/spinlock_api_smp.h:89 kernel/locking/spinlock.c:138)
[ 112.104154] ? _raw_spin_lock_irqsave (kernel/locking/spinlock.c:137)
[ 112.104178] __dev_queue_xmit (net/core/dev.c:3805 net/core/dev.c:4210)
[ 112.104200] ? filter_irq_stacks (kernel/stacktrace.c:114)
[ 112.104222] ? netdev_core_pick_tx (net/core/dev.c:4151)
[ 112.104242] ? unwind_next_frame (arch/x86/kernel/unwind_orc.c:381
arch/x86/kernel/unwind_orc.c:623)
[ 112.104264] ? i8237A_resume (./arch/x86/include/asm/dma.h:250
arch/x86/kernel/i8237.c:33)
[ 112.104282] ? ret_from_fork (arch/x86/entry/entry_64.S:308)
[ 112.104298] ? unwind_next_frame (arch/x86/kernel/unwind_orc.c:195
arch/x86/kernel/unwind_orc.c:469)
[ 112.104320] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 112.104337] ? br_handle_frame_finish (net/bridge/br_input.c:215)
[ 112.104359] ? write_profile (kernel/stacktrace.c:86)
[ 112.104379] ? arch_stack_walk (arch/x86/kernel/stacktrace.c:24)
[ 112.104398] br_dev_queue_push_xmit (net/bridge/br_forward.c:55)
[ 112.104421] ? stack_trace_save (kernel/stacktrace.c:123)
[ 112.104442] ? br_fdb_offloaded_set (net/bridge/br_forward.c:34)
[ 112.104464] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 112.104510] br_forward_finish (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/bridge/br_forward.c:66)
[ 112.104536] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:64)
[ 112.104558] ? maybe_deliver (net/bridge/br_forward.c:125
net/bridge/br_forward.c:189)
[ 112.104577] ? br_flood (net/bridge/br_forward.c:233)
[ 112.104596] ? br_fdb_offloaded_set (net/bridge/br_forward.c:34)
[ 112.104617] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 112.104639] __br_forward (./include/linux/netfilter.h:304
./include/linux/netfilter.h:297 net/bridge/br_forward.c:115)
[ 112.104660] ? br_forward_finish (net/bridge/br_forward.c:75)
[ 112.104682] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:64)
[ 112.104703] ? __copy_skb_header (./include/net/dst.h:297
net/core/skbuff.c:1338)
[ 112.104725] ? __skb_clone (./arch/x86/include/asm/atomic.h:95
(discriminator 4) ./include/linux/atomic/atomic-instrumented.h:191
(discriminator 4) net/core/skbuff.c:1409 (discriminator 4))
[ 112.104746] maybe_deliver (net/bridge/br_forward.c:193)
[ 112.104766] ? br_fdb_update (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
net/bridge/br_fdb.c:896)
[ 112.104787] br_flood (net/bridge/br_forward.c:233)
[ 112.104809] br_handle_frame_finish (net/bridge/br_input.c:215)
[ 112.104832] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 112.104855] ? br_cfm_config_fill_info
(./include/linux/skbuff.h:2527 ./include/net/netlink.h:1815
./include/net/netlink.h:1835 net/bridge/br_cfm_netlink.c:462)
[ 112.104874] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 112.104893] ? unwind_next_frame (arch/x86/kernel/unwind_orc.c:381
arch/x86/kernel/unwind_orc.c:623)
[ 112.104915] ? arch_stack_walk (arch/x86/kernel/stacktrace.c:24)
[ 112.104933] ? ret_from_fork (arch/x86/entry/entry_64.S:308)
[ 112.104949] ? unwind_next_frame (arch/x86/kernel/unwind_orc.c:195
arch/x86/kernel/unwind_orc.c:469)
[ 112.104970] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 112.104987] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 112.105006] br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[ 112.105028] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 112.105050] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 112.105071] ? packet_rcv (net/packet/af_packet.c:2231)
[ 112.105090] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[ 112.105112] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 112.105135] ? do_xdp_generic (net/core/dev.c:5281)
[ 112.105154] ? udp4_lib_lookup2 (net/ipv4/udp.c:456)
[ 112.105175] ? queued_spin_lock_slowpath
(kernel/locking/qspinlock.c:183 kernel/locking/qspinlock.c:463)
[ 112.105193] ? __udp4_lib_lookup (net/ipv4/udp.c:531)
[ 112.105215] __netif_receive_skb_list_core (net/core/dev.c:5570)
[ 112.105239] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5546)
[ 112.105262] ? load_balance (kernel/sched/fair.c:10908)
[ 112.105281] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[ 112.105302] ? ktime_get_with_offset (kernel/time/timekeeping.c:292
(discriminator 3) kernel/time/timekeeping.c:388 (discriminator 3)
kernel/time/timekeeping.c:891 (discriminator 3))
[ 112.105323] netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[ 112.105346] ? process_backlog (net/core/dev.c:5699)
[ 112.105368] ? napi_gro_flush (./arch/x86/include/asm/bitops.h:94
./include/asm-generic/bitops/instrumented-non-atomic.h:45
net/core/gro.c:346 net/core/gro.c:361)
[ 112.105386] ? dev_gro_receive (./arch/x86/include/asm/bitops.h:68
(discriminator 8)
./include/asm-generic/bitops/instrumented-non-atomic.h:29
(discriminator 8) net/core/gro.c:581 (discriminator 8))
[ 112.105405] napi_complete_done (./include/linux/list.h:37
./include/net/gro.h:434 ./include/net/gro.h:429 net/core/dev.c:6067)
[ 112.105425] ? napi_busy_loop (net/core/dev.c:6034)
[ 112.105447] ixgbe_poll (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3191)
[ 112.105468] ? attach_entity_load_avg (kernel/sched/pelt.h:44
kernel/sched/fair.c:4162)
[ 112.105514] ? ixgbe_xdp_ring_update_tail_locked
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3141)
[ 112.105544] __napi_poll (net/core/dev.c:6498)
[ 112.105566] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[ 112.105589] ? __napi_poll (net/core/dev.c:6625)
[ 112.105608] ? migrate_enable (kernel/sched/core.c:3045)
[ 112.105630] ? __kthread_parkme (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
kernel/kthread.c:271)
[ 112.105649] ? __napi_poll (net/core/dev.c:6625)
[ 112.105668] kthread (kernel/kthread.c:379)
[ 112.105687] ? kthread_complete_and_exit (kernel/kthread.c:336)
[ 112.105708] ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 112.105729] </TASK>
[ 112.105739] ---[ end trace 0000000000000000 ]---

On Wed, Jun 28, 2023 at 1:47 PM Ian Kumlien <[email protected]> wrote:
>
> On Wed, Jun 28, 2023 at 11:06 AM Paolo Abeni <[email protected]> wrote:
> >
> > Hello,
> >
> > On Wed, 2023-06-28 at 09:37 +0200, Ian Kumlien wrote:
> > > Been running all night but eventually it crashed again...
> > >
> > > [21753.055795] Out of memory: Killed process 970 (qemu-system-x86)
> > > total-vm:4709488kB, anon-rss:2172652kB, file-rss:4608kB,
> > > shmem-rss:0kB, UID:77 pgtables:4800kB oom_score_adj:0
> > > [24249.061154] general protection fault, probably for non-canonical
> > > address 0xb0746d4e6bee35e2: 0000 [#1] PREEMPT SMP NOPTI
> > > [24249.072138] CPU: 0 PID: 893 Comm: napi/eno1-68 Tainted: G W
> > > 6.4.0-dirty #366
> > > [24249.080670] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
> > > BIOS 1.7a 10/13/2022
> > > [24249.088852] RIP: 0010:kmem_cache_alloc_bulk (mm/slub.c:377
> > > mm/slub.c:388 mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
> > > [24249.094086] Code: 0f 84 46 ff ff ff 65 ff 05 a4 bd e4 47 48 8b 4d
> > > 00 65 48 03 0d e8 5f e3 47 9c 5e fa 45 31 d2 eb 2f 8b 45 28 48 01 d0
> > > 48 89 c7 <48> 8b 00 48 33 85 b8 00 00 00 48 0f cf 48 31 f8 48 89 01 49
> > > 89 17
> > > All code
> > > ========
> > > 0: 0f 84 46 ff ff ff je 0xffffffffffffff4c
> > > 6: 65 ff 05 a4 bd e4 47 incl %gs:0x47e4bda4(%rip) # 0x47e4bdb1
> > > d: 48 8b 4d 00 mov 0x0(%rbp),%rcx
> > > 11: 65 48 03 0d e8 5f e3 add %gs:0x47e35fe8(%rip),%rcx # 0x47e36001
> > > 18: 47
> > > 19: 9c pushf
> > > 1a: 5e pop %rsi
> > > 1b: fa cli
> > > 1c: 45 31 d2 xor %r10d,%r10d
> > > 1f: eb 2f jmp 0x50
> > > 21: 8b 45 28 mov 0x28(%rbp),%eax
> > > 24: 48 01 d0 add %rdx,%rax
> > > 27: 48 89 c7 mov %rax,%rdi
> > > 2a:* 48 8b 00 mov (%rax),%rax <-- trapping instruction
> > > 2d: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> > > 34: 48 0f cf bswap %rdi
> > > 37: 48 31 f8 xor %rdi,%rax
> > > 3a: 48 89 01 mov %rax,(%rcx)
> > > 3d: 49 89 17 mov %rdx,(%r15)
> > >
> > > Code starting with the faulting instruction
> > > ===========================================
> > > 0: 48 8b 00 mov (%rax),%rax
> > > 3: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> > > a: 48 0f cf bswap %rdi
> > > d: 48 31 f8 xor %rdi,%rax
> > > 10: 48 89 01 mov %rax,(%rcx)
> > > 13: 49 89 17 mov %rdx,(%r15)
> > > [24249.112951] RSP: 0018:ffff9fc303973d20 EFLAGS: 00010086
> > > [24249.118275] RAX: b0746d4e6bee35e2 RBX: 0000000000000001 RCX: ffff8d5a2fa31da0
> > > [24249.125501] RDX: b0746d4e6bee3572 RSI: 0000000000000286 RDI: b0746d4e6bee35e2
> > > [24249.132730] RBP: ffff8d56c016d500 R08: 0000000000000400 R09: ffff8d56ede0e67a
> > > [24249.139958] R10: 0000000000000001 R11: ffff8d56c59d88c0 R12: 0000000000000010
> > > [24249.147187] R13: 0000000000000820 R14: ffff8d5a2fa2a810 R15: ffff8d5a2fa2a818
> > > [24249.154415] FS: 0000000000000000(0000) GS:ffff8d5a2fa00000(0000)
> > > knlGS:0000000000000000
> > > [24249.162620] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [24249.168471] CR2: 00007f0f3f7f8760 CR3: 0000000102466000 CR4: 00000000003526f0
> > > [24249.175717] Call Trace:
> > > [24249.178268] <TASK>
> > > [24249.180476] ? die_addr (arch/x86/kernel/dumpstack.c:421
> > > arch/x86/kernel/dumpstack.c:460)
> > > [24249.183907] ? exc_general_protection (arch/x86/kernel/traps.c:783
> > > arch/x86/kernel/traps.c:728)
> > > [24249.188726] ? asm_exc_general_protection
> > > (./arch/x86/include/asm/idtentry.h:564)
> > > [24249.193720] ? kmem_cache_alloc_bulk (mm/slub.c:377 mm/slub.c:388
> > > mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
> > > [24249.198361] ? netif_receive_skb_list_internal (net/core/dev.c:5729)
> > > [24249.203960] napi_skb_cache_get (net/core/skbuff.c:338)
> > > [24249.208078] __napi_build_skb (net/core/skbuff.c:517)
> > > [24249.211934] napi_build_skb (net/core/skbuff.c:541)
> > > [24249.215616] ixgbe_poll
> > > (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2165
> > > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2361
> > > drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3178)
> > > [24249.219305] __napi_poll (net/core/dev.c:6498)
> > > [24249.222905] napi_threaded_poll (./include/linux/netpoll.h:89
> > > net/core/dev.c:6640)
> > > [24249.227197] ? __napi_poll (net/core/dev.c:6625)
> > > [24249.231050] kthread (kernel/kthread.c:379)
> > > [24249.234300] ? kthread_complete_and_exit (kernel/kthread.c:332)
> > > [24249.239207] ret_from_fork (arch/x86/entry/entry_64.S:314)
> > > [24249.242892] </TASK>
> > > [24249.245185] Modules linked in: chaoskey
> > > [24249.249133] ---[ end trace 0000000000000000 ]---
> > > [24249.270157] pstore: backend (erst) writing error (-28)
> > > [24249.275408] RIP: 0010:kmem_cache_alloc_bulk (mm/slub.c:377
> > > mm/slub.c:388 mm/slub.c:395 mm/slub.c:3963 mm/slub.c:4026)
> > > [24249.280660] Code: 0f 84 46 ff ff ff 65 ff 05 a4 bd e4 47 48 8b 4d
> > > 00 65 48 03 0d e8 5f e3 47 9c 5e fa 45 31 d2 eb 2f 8b 45 28 48 01 d0
> > > 48 89 c7 <48> 8b 00 48 33 85 b8 00 00 00 48 0f cf 48 31 f8 48 89 01 49
> > > 89 17
> > > All code
> > > ========
> > > 0: 0f 84 46 ff ff ff je 0xffffffffffffff4c
> > > 6: 65 ff 05 a4 bd e4 47 incl %gs:0x47e4bda4(%rip) # 0x47e4bdb1
> > > d: 48 8b 4d 00 mov 0x0(%rbp),%rcx
> > > 11: 65 48 03 0d e8 5f e3 add %gs:0x47e35fe8(%rip),%rcx # 0x47e36001
> > > 18: 47
> > > 19: 9c pushf
> > > 1a: 5e pop %rsi
> > > 1b: fa cli
> > > 1c: 45 31 d2 xor %r10d,%r10d
> > > 1f: eb 2f jmp 0x50
> > > 21: 8b 45 28 mov 0x28(%rbp),%eax
> > > 24: 48 01 d0 add %rdx,%rax
> > > 27: 48 89 c7 mov %rax,%rdi
> > > 2a:* 48 8b 00 mov (%rax),%rax <-- trapping instruction
> > > 2d: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> > > 34: 48 0f cf bswap %rdi
> > > 37: 48 31 f8 xor %rdi,%rax
> > > 3a: 48 89 01 mov %rax,(%rcx)
> > > 3d: 49 89 17 mov %rdx,(%r15)
> > >
> > > Code starting with the faulting instruction
> > > ===========================================
> > > 0: 48 8b 00 mov (%rax),%rax
> > > 3: 48 33 85 b8 00 00 00 xor 0xb8(%rbp),%rax
> > > a: 48 0f cf bswap %rdi
> > > d: 48 31 f8 xor %rdi,%rax
> > > 10: 48 89 01 mov %rax,(%rcx)
> > > 13: 49 89 17 mov %rdx,(%r15)
> > > [24249.299578] RSP: 0018:ffff9fc303973d20 EFLAGS: 00010086
> > > [24249.304917] RAX: b0746d4e6bee35e2 RBX: 0000000000000001 RCX: ffff8d5a2fa31da0
> > > [24249.312161] RDX: b0746d4e6bee3572 RSI: 0000000000000286 RDI: b0746d4e6bee35e2
> > > [24249.319407] RBP: ffff8d56c016d500 R08: 0000000000000400 R09: ffff8d56ede0e67a
> > > [24249.326651] R10: 0000000000000001 R11: ffff8d56c59d88c0 R12: 0000000000000010
> > > [24249.333896] R13: 0000000000000820 R14: ffff8d5a2fa2a810 R15: ffff8d5a2fa2a818
> > > [24249.341141] FS: 0000000000000000(0000) GS:ffff8d5a2fa00000(0000)
> > > knlGS:0000000000000000
> > > [24249.349356] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [24249.355206] CR2: 00007f0f3f7f8760 CR3: 0000000102466000 CR4: 00000000003526f0
> > > [24249.362452] Kernel panic - not syncing: Fatal exception in interrupt
> > > [24249.566854] Kernel Offset: 0x36e00000 from 0xffffffff81000000
> > > (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> > > [24249.594124] ---[ end Kernel panic - not syncing: Fatal exception in
> > > interrupt ]---
> > >
> > > It's also odd that i get a OOM - it only seems to happen when i enable
> > > rx-gro-list
> >
> > Unfortunately, not the result I was looking for. That leads to more
> > questions than answers, I'm sorry.
>
> I understand you...
>
> > How long did the host keep going with rx-gro-list enabled?
>
> Well, hours...
>
> reboot system boot 6.4.0-dirty Wed Jun 28 04:20 - 13:39 (09:19)
> reboot system boot 6.4.0-dirty Tue Jun 27 21:31 - 13:39 (16:08)
>
> So, let's imagine a few seconds to log in and enable everything
>
> > Did you observe the WARN_ON() introduced by the tentative fix?
>
> I could only see the console, so saw nothing...
>
> > > - it's also odd because this machine always has ~8GB of
> > > memory available
> >
> > It looks like there is a memory leak somewhere, and I don't think the
> > tentative fixup introduced such an issue.
>
> I agree, it was there before...
>
> > It looks like the above splat is due to a slab corruption, which in
> > turn could be unrelated to the mentioned leak, but it could/should
> > be related to rx-gro-list.
>
> Agreed =)
>
> > Could you please run the test with both kmemleak and kasan enabled?
>
> Machine-slowdown-enabled^tm
>
> > Additionally could you please disclose if you have non trivial
> > netfilter and/or bridge filter and/or tc rules possibly modifying the
> > incoming/egress packets?
>
> I only have basic reject accept rules, some snat/dnat pairs, but i
> don't see it ending up in "non trivial" ;)
>
> > If kasan is not an option, could you please apply the debug patch
> > below? (on top of the previous one)
>
> I actually did both, if it's unrelated we should know as well..
>
> I hope i have something for you before tomorrow, else there will be a
> bit of a break until next week
>
> > Thanks!
> >
> > Paolo
> > ---
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index 6c5915efbc17..94adca27b205 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -4295,6 +4295,8 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > delta_len += nskb->len;
> >
> > skb_push(nskb, -skb_network_offset(nskb) + offset);
> > + if (WARN_ON_ONCE(nskb->data - skb->head > skb->tail))
> > + goto err_linearize;
> >
> > skb_release_head_state(nskb);
> > len_diff = skb_network_header_len(nskb) - skb_network_header_len(skb);
> > @@ -4302,6 +4304,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> >
> > skb_headers_offset_update(nskb, skb_headroom(nskb) - skb_headroom(skb));
> > nskb->transport_header += len_diff;
> > + if (WARN_ON_ONCE(tnl_hlen > skb_headroom(nskb)))
> > + goto err_linearize;
> > + if (WARN_ON_ONCE(skb_headroom(nskb) + offset > nskb->tail))
> > + goto err_linearize;
> > +
> > skb_copy_from_linear_data_offset(skb, -tnl_hlen,
> > nskb->data - tnl_hlen,
> > offset + tnl_hlen);
> >
> >
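
A note on the decoded oops quoted above: the trapping load, mov (%rax),%rax, is followed by an XOR with a per-cache value at 0xb8(%rbp) and a byte-swapped copy of the slot address. That sequence appears to match SLUB's hardened freelist pointer handling (CONFIG_SLAB_FREELIST_HARDENED). RAX and RDI hold 0xb0746d4e6bee35e2, which is not a canonical x86-64 address, so the load raises a general protection fault rather than a page fault. This is consistent with the slab corruption suspected in the thread. The following is a minimal user-space sketch of both checks. The function names are illustrative, not kernel APIs, and 48-bit virtual addressing is assumed.

```c
#include <stdbool.h>
#include <stdint.h>

/* Canonical check assuming 48-bit virtual addresses (4-level paging):
 * bits 63..47 must be all zeros or all ones. */
bool is_canonical_48(uint64_t addr)
{
	uint64_t top = addr >> 47;	/* the upper 17 bits */

	return top == 0 || top == 0x1ffff;
}

/* Portable 64-bit byte swap, standing in for the kernel's swab(). */
uint64_t swab64_sketch(uint64_t v)
{
	uint64_t out = 0;

	for (int i = 0; i < 8; i++) {
		out = (out << 8) | (v & 0xff);
		v >>= 8;
	}
	return out;
}

/* Hardened freelist transform: ptr ^ per-cache random ^ swab(slot address).
 * XOR is its own inverse, so the same function encodes and decodes. */
uint64_t freelist_xform(uint64_t ptr, uint64_t cache_random, uint64_t slot_addr)
{
	return ptr ^ cache_random ^ swab64_sketch(slot_addr);
}
```

With the register values from the trace, is_canonical_48() rejects RAX (0xb0746d4e6bee35e2) but accepts RCX (0xffff8d5a2fa31da0). The bad value is already in RDX before the decode runs, which would point at a corrupted freelist head rather than a bug in the decode itself.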

2023-06-28 15:50:52

by Paolo Abeni

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Wed, 2023-06-28 at 14:04 +0200, Ian Kumlien wrote:
> So have some hits, would it be better without your warn on? ... Things
> are a bit slow atm - lets just say that i noticed the stacktraces
> because a stream stuttered =)

Sorry, I completely screwed up a newly added check.

If you have Kasan enabled you can simply and more safely remove my 2nd
patch. Kasan should be able to catch all the out-of-buffer scenarios
such checks were intended to prevent.

Cheers,

Paolo


2023-06-28 20:37:05

by Ian Kumlien

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Wed, Jun 28, 2023 at 5:15 PM Paolo Abeni <[email protected]> wrote:
>
> On Wed, 2023-06-28 at 14:04 +0200, Ian Kumlien wrote:
> > So have some hits, would it be better without your warn on? ... Things
> > are a bit slow atm - lets just say that i noticed the stacktraces
> > because a stream stuttered =)
>
> Sorry, I completely screwed up a newly added check.

That's ok

> If you have Kasan enabled you can simply and more safely remove my 2nd
> patch. Kasan should be able to catch all the out-of-buffer scenarios
> such checks were intended to prevent.

I thought I'd run without any of the patches, preparing for that now,
but i have to stop testing tomorrow and will continue on monday if i
don't catch anything

> Cheers,
>
> Paolo
>

2023-06-29 11:10:28

by Ian Kumlien

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Wed, Jun 28, 2023 at 10:18 PM Ian Kumlien <[email protected]> wrote:
>
> On Wed, Jun 28, 2023 at 5:15 PM Paolo Abeni <[email protected]> wrote:
> >
> > On Wed, 2023-06-28 at 14:04 +0200, Ian Kumlien wrote:
> > > So have some hits, would it be better without your warn on? ... Things
> > > are a bit slow atm - lets just say that i noticed the stacktraces
> > > because a stream stuttered =)
> >
> > Sorry, I completely screwed up a newly added check.
>
> That's ok
>
> > If you have Kasan enabled you can simply and more safely remove my 2nd
> > patch. Kasan should be able to catch all the out-of-buffer scenarios
> > such checks were intended to prevent.
>
> I thought I'd run without any of the patches, preparing for that now,
> but i have to stop testing tomorrow and will continue on monday if i
> don't catch anything

So, KASAN caught the null pointer derefs, as expected, but it caught
two of them which i didn't expect.

Anyway, I'm off for the weekend so, I hope to be able to send
something better on Monday, fyi

> > Cheers,
> >
> > Paolo
> >

2023-07-03 09:47:16

by Ian Kumlien

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

So, got back, switched to 6.4.1 and reran with kmemleak and kasan

I got the splat from:
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index cea28d30abb5..701c1b5cf532 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4328,6 +4328,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

skb->prev = tail;

+ if (WARN_ON_ONCE(!skb->next))
+ goto err_linearize;
+
if (skb_needs_linearize(skb, features) &&
__skb_linearize(skb))
goto err_linearize;

I'm just happy i ran with dmesg -W since there was only minimal output
on the console:
[39914.833696] rcu: INFO: rcu_preempt self-detected stall on CPU
[39914.839598] rcu: 2-....: (20997 ticks this GP)
idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=4687
[39914.849839] rcu: (t=21017 jiffies g=18175157 q=45473 ncpus=12)
[39977.862108] rcu: INFO: rcu_preempt self-detected stall on CPU
[39977.868002] rcu: 2-....: (84001 ticks this GP)
idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=28434
[39977.878340] rcu: (t=84047 jiffies g=18175157 q=263477 ncpus=12)
[40040.892521] rcu: INFO: rcu_preempt self-detected stall on CPU
[40040.898414] rcu: 2-....: (147006 ticks this GP)
idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=53043
[40040.908831] rcu: (t=147079 jiffies g=18175157 q=464422 ncpus=12)
[40065.080842] ixgbe 0000:06:00.1 eno2: Reset adapter
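
The three stall reports above can be cross-checked with simple arithmetic. The t= counter grows by 63030 jiffies between the first two reports while the timestamps advance by about 63.03 s. That suggests HZ=1000 and a stall that began roughly 21 s before the first report, around 39893.83. The following is a small sketch of that calculation; the helper names are made up for illustration and are not kernel functions.

```c
/* Estimate the tick rate from two stall reports: the change in the
 * "t=<jiffies>" counter divided by the change in the dmesg timestamp. */
double estimate_hz(double ts0, long jiffies0, double ts1, long jiffies1)
{
	return (double)(jiffies1 - jiffies0) / (ts1 - ts0);
}

/* Back-compute when the stalled grace period started, given one report's
 * timestamp (seconds), its t= value (jiffies) and the tick rate. */
double stall_start(double ts, long jiffies, double hz)
{
	return ts - (double)jiffies / hz;
}
```

Feeding in the first and last reports (39914.849839 with t=21017, 40040.908831 with t=147079) gives the same start time to within a few milliseconds. All three reports therefore describe one continuous stall on CPU 2, not three separate ones.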

And in dmesg -W i got:
cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
[39914.833696] rcu: INFO: rcu_preempt self-detected stall on CPU
[39914.839598] rcu: 2-....: (20997 ticks this GP)
idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=4687
[39914.849839] rcu: (t=21017 jiffies g=18175157 q=45473 ncpus=12)
[39914.855892] CPU: 2 PID: 913 Comm: napi/eno2-84 Tainted: G W
6.4.1-dirty #372
[39914.855921] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[39914.855938] RIP: 0010:_raw_spin_unlock_irqrestore
(./include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194)
[39914.855980] Code: cc 66 0f 1f 84 00 00 00 00 00 55 48 89 f5 53 48
89 fb e8 03 72 88 fe c6 03 00 f7 c5 00 02 00 00 74 01 fb 65 ff 0d d8
27 d0 5c <74> 07 5b 5d c3 cc cc cc cc 0f 1f 44 00 00 5b 5d c3 cc cc cc
cc 66
All code
========
0: cc int3
1: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
8: 00 00
a: 55 push %rbp
b: 48 89 f5 mov %rsi,%rbp
e: 53 push %rbx
f: 48 89 fb mov %rdi,%rbx
12: e8 03 72 88 fe call 0xfffffffffe88721a
17: c6 03 00 movb $0x0,(%rbx)
1a: f7 c5 00 02 00 00 test $0x200,%ebp
20: 74 01 je 0x23
22: fb sti
23: 65 ff 0d d8 27 d0 5c decl %gs:0x5cd027d8(%rip) # 0x5cd02802
2a:* 74 07 je 0x33 <-- trapping instruction
2c: 5b pop %rbx
2d: 5d pop %rbp
2e: c3 ret
2f: cc int3
30: cc int3
31: cc int3
32: cc int3
33: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
38: 5b pop %rbx
39: 5d pop %rbp
3a: c3 ret
3b: cc int3
3c: cc int3
3d: cc int3
3e: cc int3
3f: 66 data16

Code starting with the faulting instruction
===========================================
0: 74 07 je 0x9
2: 5b pop %rbx
3: 5d pop %rbp
4: c3 ret
5: cc int3
6: cc int3
7: cc int3
8: cc int3
9: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
e: 5b pop %rbx
f: 5d pop %rbp
10: c3 ret
11: cc int3
12: cc int3
13: cc int3
14: cc int3
15: 66 data16
[39914.856006] RSP: 0018:ffff888109f9e9d0 EFLAGS: 00000206
[39914.856034] RAX: 0000000000000004 RBX: ffffffffa55812e0 RCX: ffffffffa3334d1d
[39914.856054] RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffffffffa55812e0
[39914.856073] RBP: 0000000000000246 R08: 0000000000000000 R09: 0000000000000003
[39914.856090] R10: fffffbfff4ab025c R11: 00000000000080fe R12: 0000000000000000
[39914.856108] R13: 0000000000082820 R14: 0000000000000240 R15: 0000000000000000
[39914.856126] FS: 0000000000000000(0000) GS:ffff8883ef700000(0000)
knlGS:0000000000000000
[39914.856147] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[39914.856165] CR2: 00007f21303f298c CR3: 00000002ab8a6000 CR4: 00000000003526e0
[39914.856185] Call Trace:
[39914.856201] <IRQ>
[39914.856218] ? rcu_dump_cpu_stacks (kernel/rcu/tree_stall.h:372)
[39914.856252] ? rcu_sched_clock_irq (kernel/rcu/tree_stall.h:692
kernel/rcu/tree_stall.h:774 kernel/rcu/tree.c:3822
kernel/rcu/tree.c:2214)
[39914.856284] ? resched_curr (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
./include/linux/thread_info.h:118 ./include/linux/sched.h:2050
./include/linux/sched.h:2065 kernel/sched/core.c:1048)
[39914.856312] ? wake_q_add_safe (kernel/sched/core.c:1042)
[39914.856339] ? rcu_note_context_switch (kernel/rcu/tree.c:2193)
[39914.856370] ? clear_buddies (kernel/sched/fair.c:4922)
[39914.856393] ? run_posix_cpu_timers
(./include/linux/sched/deadline.h:15
./include/linux/sched/deadline.h:22
kernel/time/posix-cpu-timers.c:1155
kernel/time/posix-cpu-timers.c:1451)
[39914.856422] ? clear_posix_cputimers_work
(kernel/time/posix-cpu-timers.c:1435)
[39914.856451] ? cpuacct_account_field (./include/linux/cgroup.h:437
kernel/sched/cpuacct.c:39 kernel/sched/cpuacct.c:354)
[39914.856480] ? hrtimer_run_queues (kernel/time/hrtimer.c:1900)
[39914.856509] ? update_process_times
(./arch/x86/include/asm/preempt.h:27 kernel/time/timer.c:2073)
[39914.856536] ? tick_sched_handle (kernel/time/tick-sched.c:255)
[39914.856566] ? tick_sched_timer (kernel/time/tick-sched.c:1497)
[39914.856596] ? tick_sched_do_timer (kernel/time/tick-sched.c:1479)
[39914.856626] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1685
kernel/time/hrtimer.c:1749)
[39914.856653] ? netif_tx_stop_all_queues (net/core/dev.c:5992)
[39914.856688] ? enqueue_hrtimer (kernel/time/hrtimer.c:1719)
[39914.856714] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161)
[39914.856746] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[39914.856773] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[39914.856800] ? ktime_get_update_offsets_now
(kernel/time/timekeeping.c:2323 (discriminator 3))
[39914.856831] ? hrtimer_interrupt (kernel/time/hrtimer.c:1814)
[39914.856863] ? __sysvec_apic_timer_interrupt
(./arch/x86/include/asm/jump_label.h:27
./include/linux/jump_label.h:207
./arch/x86/include/asm/trace/irq_vectors.h:41
arch/x86/kernel/apic/apic.c:1113)
[39914.856893] ? sysvec_apic_timer_interrupt
(arch/x86/kernel/apic/apic.c:1106 (discriminator 14))
[39914.856919] </IRQ>
[39914.856932] <TASK>
[39914.856945] ? asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[39914.856983] ? _raw_spin_unlock_irqrestore
(./include/asm-generic/qspinlock.h:128 ./include/linux/spinlock.h:203
./include/linux/spinlock_api_smp.h:150 kernel/locking/spinlock.c:194)
[39914.857017] ? _raw_spin_unlock_irqrestore
(./include/linux/spinlock_api_smp.h:152 kernel/locking/spinlock.c:194)
[39914.857050] kmem_cache_alloc_node (./include/linux/kmemleak.h:42
mm/slab.h:714 mm/slub.c:3451 mm/slub.c:3496)
[39914.857086] kmalloc_reserve (net/core/skbuff.c:571)
[39914.857118] __alloc_skb (net/core/skbuff.c:654)
[39914.857148] ? __napi_build_skb (net/core/skbuff.c:627)
[39914.857180] ? skb_segment (net/core/skbuff.c:4531)
[39914.857204] ? skb_segment (./include/linux/skbuff.h:2577
net/core/skbuff.c:4527)
[39914.857231] skb_segment (net/core/skbuff.c:4519)
[39914.857257] ? write_profile (kernel/stacktrace.c:83)
[39914.857296] ? pskb_extract (net/core/skbuff.c:4360)
[39914.857320] ? rt6_score_route (net/ipv6/route.c:713 (discriminator 1))
[39914.857346] ? llist_add_batch (lib/llist.c:33 (discriminator 14))
[39914.857379] __udp_gso_segment (net/ipv4/udp_offload.c:290)
[39914.857413] ? ip6_dst_destroy (net/ipv6/route.c:788)
[39914.857442] udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[39914.857472] ? udp6_gro_complete (net/ipv6/udp_offload.c:20)
[39914.857498] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:53)
[39914.857528] ipv6_gso_segment (net/ipv6/ip6_offload.c:119
net/ipv6/ip6_offload.c:74)
[39914.857557] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:76)
[39914.857583] ? nft_update_chain_stats (net/netfilter/nf_tables_core.c:254)
[39914.857612] ? fib6_select_path (net/ipv6/route.c:458)
[39914.857643] skb_mac_gso_segment (net/core/gro.c:141)
[39914.857673] ? skb_eth_gso_segment (net/core/gro.c:127)
[39914.857702] ? ipv6_skip_exthdr (net/ipv6/exthdrs_core.c:190)
[39914.857726] ? kasan_save_stack (mm/kasan/common.c:47)
[39914.857758] __skb_gso_segment (net/core/dev.c:3401 (discriminator 2))
[39914.857787] udpv6_queue_rcv_skb (./include/net/udp.h:492
net/ipv6/udp.c:796 net/ipv6/udp.c:787)
[39914.857816] __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)
[39914.857846] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437
(discriminator 4))
[39914.857884] ip6_input_finish (./include/linux/rcupdate.h:805
net/ipv6/ip6_input.c:483)
[39914.857913] ip6_input (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:491)
[39914.857941] ? ip6_input_finish (net/ipv6/ip6_input.c:490)
[39914.857970] ? ip6_route_del (net/ipv6/route.c:4013)
[39914.857998] ? ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:479)
[39914.858028] ? __rcu_read_unlock (kernel/rcu/tree_plugin.h:382
kernel/rcu/tree_plugin.h:421)
[39914.858055] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[39914.858086] ip6_mc_input (net/ipv6/ip6_input.c:591)
[39914.858116] ? ip6_rcv_finish (net/ipv6/ip6_input.c:498)
[39914.858151] ipv6_rcv (./include/net/dst.h:468
net/ipv6/ip6_input.c:79 ./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:309)
[39914.858179] ? ip6_input (net/ipv6/ip6_input.c:303)
[39914.858206] ? stack_trace_save (kernel/stacktrace.c:123)
[39914.858236] ? ipv6_list_rcv (net/ipv6/ip6_input.c:70)
[39914.858268] ? ip6_input (net/ipv6/ip6_input.c:303)
[39914.858294] __netif_receive_skb_one_core (net/core/dev.c:5486)
[39914.858326] ? __netif_receive_skb_list_core (net/core/dev.c:5486)
[39914.858358] ? br_nf_dev_queue_xmit (net/bridge/br_netfilter_hooks.c:820)
[39914.858387] ? br_forward_finish (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/bridge/br_forward.c:66)
[39914.858417] netif_receive_skb (net/core/dev.c:5693 net/core/dev.c:5752)
[39914.858447] ? __netif_receive_skb (net/core/dev.c:5747)
[39914.858476] ? br_multicast_set_startup_query_intvl
(net/bridge/br_multicast.c:5014)
[39914.858507] ? br_nf_forward_ip (net/bridge/br_netfilter_hooks.c:647)
[39914.858533] ? nf_hook_slow (net/netfilter/core.c:625)
[39914.858562] ? br_handle_vlan (net/bridge/br_vlan.c:483)
[39914.858592] br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[39914.858622] ? br_netif_receive_skb (net/bridge/br_input.c:34)
[39914.858655] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:64)
[39914.858690] br_handle_frame_finish (net/bridge/br_input.c:216)
[39914.858724] ? br_handle_local_finish (net/bridge/br_input.c:75)
[39914.858755] ? br_cfm_config_fill_info (net/bridge/br_cfm_netlink.c:510)
[39914.858784] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[39914.858813] ? ret_from_fork (arch/x86/entry/entry_64.S:308)
[39914.858837] ? unwind_next_frame (arch/x86/kernel/unwind_orc.c:655
(discriminator 3))
[39914.858868] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[39914.858893] ? write_profile (kernel/stacktrace.c:83)
[39914.858924] br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[39914.858953] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[39914.858983] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[39914.859013] ? __rcu_read_unlock (kernel/rcu/tree_plugin.h:382
kernel/rcu/tree_plugin.h:421)
[39914.859042] ? br_handle_local_finish (net/bridge/br_input.c:75)
[39914.859072] ? packet_rcv (net/packet/af_packet.c:2231)
[39914.859100] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[39914.859131] ? virtio_net_hdr_to_skb.constprop.0 (drivers/net/tun.c:753)
[39914.859161] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[39914.859194] ? udp_lib_lport_inuse (net/ipv4/udp.c:152)
[39914.859224] ? do_xdp_generic (net/core/dev.c:5281)
[39914.859254] ? udp4_lib_lookup2 (net/ipv4/udp.c:449 (discriminator 9))
[39914.859286] ? __udp4_lib_lookup (net/ipv4/udp.c:531)
[39914.859317] __netif_receive_skb_list_core (net/core/dev.c:5570)
[39914.859351] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5546)
[39914.859382] ? udp_push_pending_frames (net/ipv4/udp.c:495)
[39914.859415] ? kmem_cache_alloc_bulk (mm/slub.c:4033)
[39914.859445] ? napi_skb_cache_get (net/core/skbuff.c:338)
[39914.859474] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[39914.859503] netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[39914.859538] ? process_backlog (net/core/dev.c:5699)
[39914.859565] ? udp4_gro_complete (net/ipv4/udp_offload.c:714)
[39914.859595] ? __rcu_read_unlock (kernel/rcu/tree_plugin.h:423)
[39914.859622] ? napi_gro_complete.constprop.0
(./include/net/gro.h:444 net/core/gro.c:328)
[39914.859653] ? napi_gro_flush (./arch/x86/include/asm/bitops.h:94
./include/asm-generic/bitops/instrumented-non-atomic.h:45
net/core/gro.c:346 net/core/gro.c:361)
[39914.859683] napi_complete_done (./include/linux/list.h:37
./include/net/gro.h:434 ./include/net/gro.h:429 net/core/dev.c:6067)
[39914.859714] ? napi_busy_loop (net/core/dev.c:6034)
[39914.859746] ixgbe_poll (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3191)
[39914.859785] ? ixgbe_xdp_ring_update_tail_locked
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3141)
[39914.859815] ? asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[39914.859849] ? asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[39914.859887] __napi_poll (net/core/dev.c:6498)
[39914.859917] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[39914.859949] ? __napi_poll (net/core/dev.c:6625)
[39914.859976] ? migrate_enable (kernel/sched/core.c:3045)
[39914.860009] ? __napi_poll (net/core/dev.c:6625)
[39914.860037] kthread (kernel/kthread.c:379)
[39914.860064] ? kthread_complete_and_exit (kernel/kthread.c:332)
[39914.860095] ret_from_fork (arch/x86/entry/entry_64.S:314)
[39914.860126] </TASK>
[39977.862108] rcu: INFO: rcu_preempt self-detected stall on CPU
[39977.868002] rcu: 2-....: (84001 ticks this GP)
idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=28434
[39977.878340] rcu: (t=84047 jiffies g=18175157 q=263477 ncpus=12)
[39977.884486] CPU: 2 PID: 913 Comm: napi/eno2-84 Tainted: G W
6.4.1-dirty #372
[39977.884520] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[39977.884537] RIP: 0010:__orc_find (arch/x86/kernel/unwind_orc.c:52)
[39977.884577] Code: 00 00 00 41 57 89 d0 41 56 41 55 4c 8d 6c 87 fc
41 54 55 53 48 83 ec 08 48 89 34 24 85 d2 0f 84 81 00 00 00 49 89 fc
49 39 fd <0f> 82 8a 00 00 00 48 89 cb 48 89 fd 49 89 ff eb 0c 4d 8d 7e
04 4c
All code
========
0: 00 00 add %al,(%rax)
2: 00 41 57 add %al,0x57(%rcx)
5: 89 d0 mov %edx,%eax
7: 41 56 push %r14
9: 41 55 push %r13
b: 4c 8d 6c 87 fc lea -0x4(%rdi,%rax,4),%r13
10: 41 54 push %r12
12: 55 push %rbp
13: 53 push %rbx
14: 48 83 ec 08 sub $0x8,%rsp
18: 48 89 34 24 mov %rsi,(%rsp)
1c: 85 d2 test %edx,%edx
1e: 0f 84 81 00 00 00 je 0xa5
24: 49 89 fc mov %rdi,%r12
27: 49 39 fd cmp %rdi,%r13
2a:* 0f 82 8a 00 00 00 jb 0xba <-- trapping instruction
30: 48 89 cb mov %rcx,%rbx
33: 48 89 fd mov %rdi,%rbp
36: 49 89 ff mov %rdi,%r15
39: eb 0c jmp 0x47
3b: 4d 8d 7e 04 lea 0x4(%r14),%r15
3f: 4c rex.WR

Code starting with the faulting instruction
===========================================
0: 0f 82 8a 00 00 00 jb 0x90
6: 48 89 cb mov %rcx,%rbx
9: 48 89 fd mov %rdi,%rbp
c: 49 89 ff mov %rdi,%r15
f: eb 0c jmp 0x1d
11: 4d 8d 7e 04 lea 0x4(%r14),%r15
15: 4c rex.WR
[39977.884602] RSP: 0018:ffff888109f9e518 EFLAGS: 00000246
[39977.884631] RAX: 0000000000000001 RBX: ffff888109f9e5c8 RCX: ffffffffa3169519
[39977.884652] RDX: 0000000000000001 RSI: ffffffffa51081fa RDI: ffffffffa4c4a7a4
[39977.884671] RBP: 00000000000bbad1 R08: ffffffffa5108200 R09: ffffffffa50de9d7
[39977.884690] R10: 0000000000020001 R11: 00000000000080fe R12: ffffffffa4c4a7a4
[39977.884709] R13: ffffffffa4c4a7a4 R14: ffff888109f9e5fd R15: ffffffffa3169519
[39977.884729] FS: 0000000000000000(0000) GS:ffff8883ef700000(0000)
knlGS:0000000000000000
[39977.884750] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[39977.884768] CR2: 00007f21303f298c CR3: 00000002ab8a6000 CR4: 00000000003526e0
[39977.884788] Call Trace:
[39977.884804] <IRQ>
[39977.884821] ? rcu_dump_cpu_stacks (kernel/rcu/tree_stall.h:372)
[39977.884855] ? rcu_sched_clock_irq (kernel/rcu/tree_stall.h:692
kernel/rcu/tree_stall.h:774 kernel/rcu/tree.c:3822
kernel/rcu/tree.c:2214)
[39977.884888] ? resched_curr (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
./include/linux/thread_info.h:118 ./include/linux/sched.h:2050
./include/linux/sched.h:2065 kernel/sched/core.c:1048)
[39977.884915] ? wake_q_add_safe (kernel/sched/core.c:1042)
[39977.884942] ? rcu_note_context_switch (kernel/rcu/tree.c:2193)
[39977.884972] ? clear_buddies (kernel/sched/fair.c:4922)
[39977.884995] ? run_posix_cpu_timers
(./include/linux/sched/deadline.h:15
./include/linux/sched/deadline.h:22
kernel/time/posix-cpu-timers.c:1155
kernel/time/posix-cpu-timers.c:1451)
[39977.885023] ? clear_posix_cputimers_work
(kernel/time/posix-cpu-timers.c:1435)
[39977.885052] ? cpuacct_account_field (./include/linux/cgroup.h:437
kernel/sched/cpuacct.c:39 kernel/sched/cpuacct.c:354)
[39977.885081] ? hrtimer_run_queues (kernel/time/hrtimer.c:1900)
[39977.885110] ? update_process_times
(./arch/x86/include/asm/preempt.h:27 kernel/time/timer.c:2073)
[39977.885136] ? tick_sched_handle (kernel/time/tick-sched.c:255)
[39977.885166] ? tick_sched_timer (kernel/time/tick-sched.c:1497)
[39977.885195] ? tick_sched_do_timer (kernel/time/tick-sched.c:1479)
[39977.885225] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1685
kernel/time/hrtimer.c:1749)
[39977.885251] ? netif_tx_stop_all_queues (net/core/dev.c:5992)
[39977.885286] ? enqueue_hrtimer (kernel/time/hrtimer.c:1719)
[39977.885311] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161)
[39977.885344] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[39977.885371] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[39977.885397] ? ktime_get_update_offsets_now
(kernel/time/timekeeping.c:2323 (discriminator 3))
[39977.885428] ? hrtimer_interrupt (kernel/time/hrtimer.c:1814)
[39977.885459] ? __sysvec_apic_timer_interrupt
(./arch/x86/include/asm/jump_label.h:27
./include/linux/jump_label.h:207
./arch/x86/include/asm/trace/irq_vectors.h:41
arch/x86/kernel/apic/apic.c:1113)
[39977.885490] ? sysvec_apic_timer_interrupt
(arch/x86/kernel/apic/apic.c:1106 (discriminator 14))
[39977.885515] </IRQ>
[39977.885528] <TASK>
[39977.885542] ? asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[39977.885575] ? udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[39977.885604] ? udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[39977.885632] ? __orc_find (arch/x86/kernel/unwind_orc.c:52)
[39977.885662] ? stack_access_ok
(./arch/x86/include/asm/stacktrace.h:60
arch/x86/kernel/unwind_orc.c:368)
[39977.885692] ? udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[39977.885717] unwind_next_frame (arch/x86/kernel/unwind_orc.c:195
arch/x86/kernel/unwind_orc.c:469)
[39977.885748] ? udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[39977.885775] ? write_profile (kernel/stacktrace.c:83)
[39977.885805] arch_stack_walk (arch/x86/kernel/stacktrace.c:24)
[39977.885834] ? udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[39977.885862] stack_trace_save (kernel/stacktrace.c:123)
[39977.885892] ? filter_irq_stacks (kernel/stacktrace.c:114)
[39977.885923] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[39977.885949] kasan_save_stack (mm/kasan/common.c:46)
[39977.885980] ? kasan_save_stack (mm/kasan/common.c:46)
[39977.886008] ? kasan_set_track (mm/kasan/common.c:52)
[39977.886036] ? __kasan_slab_alloc (mm/kasan/common.c:328)
[39977.886065] ? kmem_cache_alloc (mm/slab.h:711 mm/slub.c:3451
mm/slub.c:3459 mm/slub.c:3466 mm/slub.c:3475)
[39977.886092] ? __create_object (mm/kmemleak.c:458 mm/kmemleak.c:635)
[39977.886116] ? kmem_cache_alloc_node (./include/linux/kmemleak.h:42
mm/slab.h:714 mm/slub.c:3451 mm/slub.c:3496)
[39977.886144] ? kmalloc_reserve (net/core/skbuff.c:571)
[39977.886172] ? __alloc_skb (net/core/skbuff.c:654)
[39977.886199] ? skb_segment (net/core/skbuff.c:4519)
[39977.886223] ? __udp_gso_segment (net/ipv4/udp_offload.c:290)
[39977.886252] ? udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[39977.886279] ? kasan_save_stack (mm/kasan/common.c:47)
[39977.886308] ? kasan_save_stack (mm/kasan/common.c:46)
[39977.886337] ? kasan_set_track (mm/kasan/common.c:52)
[39977.886364] ? __kasan_slab_alloc (mm/kasan/common.c:328)
[39977.886393] ? kmem_cache_alloc_node (mm/slab.h:711 mm/slub.c:3451
mm/slub.c:3496)
[39977.886422] ? kmalloc_reserve (net/core/skbuff.c:571)
[39977.886449] ? __alloc_skb (net/core/skbuff.c:654)
[39977.886475] ? skb_segment (net/core/skbuff.c:4519)
[39977.886499] ? __udp_gso_segment (net/ipv4/udp_offload.c:290)
[39977.886527] ? udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[39977.886551] ? ipv6_gso_segment (net/ipv6/ip6_offload.c:119
net/ipv6/ip6_offload.c:74)
[39977.886576] ? skb_mac_gso_segment (net/core/gro.c:141)
[39977.886604] ? __skb_gso_segment (net/core/dev.c:3401 (discriminator 2))
[39977.886629] ? udpv6_queue_rcv_skb (./include/net/udp.h:492
net/ipv6/udp.c:796 net/ipv6/udp.c:787)
[39977.886654] ? __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)
[39977.886678] ? ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437
(discriminator 4))
[39977.886707] ? ip6_input_finish (./include/linux/rcupdate.h:805
net/ipv6/ip6_input.c:483)
[39977.886734] ? ip6_input (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:491)
[39977.886760] ? ip6_mc_input (net/ipv6/ip6_input.c:591)
[39977.886786] ? ipv6_rcv (./include/net/dst.h:468
net/ipv6/ip6_input.c:79 ./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:309)
[39977.886811] ? __netif_receive_skb_one_core (net/core/dev.c:5486)
[39977.886840] ? netif_receive_skb (net/core/dev.c:5693 net/core/dev.c:5752)
[39977.886868] ? br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[39977.886896] ? br_handle_frame_finish (net/bridge/br_input.c:216)
[39977.886925] ? br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[39977.886953] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[39977.886983] ? __netif_receive_skb_list_core (net/core/dev.c:5570)
[39977.887013] ? netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[39977.887042] ? napi_complete_done (./include/linux/list.h:37
./include/net/gro.h:434 ./include/net/gro.h:429 net/core/dev.c:6067)
[39977.887070] ? ixgbe_poll (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3191)
[39977.887097] ? __napi_poll (net/core/dev.c:6498)
[39977.887123] ? napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[39977.887151] ? kthread (kernel/kthread.c:379)
[39977.887177] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[39977.887200] ? _raw_write_lock_irq (kernel/locking/qspinlock.c:317)
[39977.887227] ? get_object (mm/kmemleak.c:608)
[39977.887249] kasan_set_track (mm/kasan/common.c:52)
[39977.887280] __kasan_slab_alloc (mm/kasan/common.c:328)
[39977.887312] kmem_cache_alloc (mm/slab.h:711 mm/slub.c:3451
mm/slub.c:3459 mm/slub.c:3466 mm/slub.c:3475)
[39977.887344] __create_object (mm/kmemleak.c:458 mm/kmemleak.c:635)
[39977.887370] ? kasan_set_track (mm/kasan/common.c:52)
[39977.887401] kmem_cache_alloc_node (./include/linux/kmemleak.h:42
mm/slab.h:714 mm/slub.c:3451 mm/slub.c:3496)
[39977.887434] kmalloc_reserve (net/core/skbuff.c:571)
[39977.887465] __alloc_skb (net/core/skbuff.c:654)
[39977.887495] ? __napi_build_skb (net/core/skbuff.c:627)
[39977.887525] ? __copy_skb_header (./include/net/dst.h:289
./include/net/dst.h:297 net/core/skbuff.c:1338)
[39977.887556] ? __copy_skb_header (./include/net/dst.h:289
./include/net/dst.h:297 net/core/skbuff.c:1338)
[39977.887589] skb_segment (net/core/skbuff.c:4519)
[39977.887615] ? write_profile (kernel/stacktrace.c:83)
[39977.887654] ? pskb_extract (net/core/skbuff.c:4360)
[39977.887678] ? rt6_score_route (net/ipv6/route.c:713 (discriminator 1))
[39977.887704] ? llist_add_batch (lib/llist.c:33 (discriminator 14))
[39977.887736] __udp_gso_segment (net/ipv4/udp_offload.c:290)
[39977.887769] ? ip6_dst_destroy (net/ipv6/route.c:788)
[39977.887798] udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[39977.887827] ? udp6_gro_complete (net/ipv6/udp_offload.c:20)
[39977.887854] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:53)
[39977.887882] ipv6_gso_segment (net/ipv6/ip6_offload.c:119
net/ipv6/ip6_offload.c:74)
[39977.887912] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:76)
[39977.887938] ? nft_update_chain_stats (net/netfilter/nf_tables_core.c:254)
[39977.887966] ? fib6_select_path (net/ipv6/route.c:458)
[39977.887997] skb_mac_gso_segment (net/core/gro.c:141)
[39977.888026] ? skb_eth_gso_segment (net/core/gro.c:127)
[39977.888054] ? ipv6_skip_exthdr (net/ipv6/exthdrs_core.c:190)
[39977.888078] ? kasan_save_stack (mm/kasan/common.c:47)
[39977.888110] __skb_gso_segment (net/core/dev.c:3401 (discriminator 2))
[39977.888139] udpv6_queue_rcv_skb (./include/net/udp.h:492
net/ipv6/udp.c:796 net/ipv6/udp.c:787)
[39977.888168] __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)
[39977.888199] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437
(discriminator 4))
[39977.888235] ip6_input_finish (./include/linux/rcupdate.h:805
net/ipv6/ip6_input.c:483)
[39977.888265] ip6_input (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:491)
[39977.888293] ? ip6_input_finish (net/ipv6/ip6_input.c:490)
[39977.888321] ? ip6_route_del (net/ipv6/route.c:4013)
[39977.888348] ? ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:479)
[39977.888378] ? __rcu_read_unlock (kernel/rcu/tree_plugin.h:382
kernel/rcu/tree_plugin.h:421)
[39977.888406] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[39977.888437] ip6_mc_input (net/ipv6/ip6_input.c:591)
[39977.888467] ? ip6_rcv_finish (net/ipv6/ip6_input.c:498)
[39977.888501] ipv6_rcv (./include/net/dst.h:468
net/ipv6/ip6_input.c:79 ./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:309)
[39977.888529] ? ip6_input (net/ipv6/ip6_input.c:303)
[39977.888556] ? stack_trace_save (kernel/stacktrace.c:123)
[39977.888585] ? ipv6_list_rcv (net/ipv6/ip6_input.c:70)
[39977.888617] ? ip6_input (net/ipv6/ip6_input.c:303)
[39977.888642] __netif_receive_skb_one_core (net/core/dev.c:5486)
[39977.888674] ? __netif_receive_skb_list_core (net/core/dev.c:5486)
[39977.888705] ? br_nf_dev_queue_xmit (net/bridge/br_netfilter_hooks.c:820)
[39977.888734] ? br_forward_finish (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/bridge/br_forward.c:66)
[39977.888764] netif_receive_skb (net/core/dev.c:5693 net/core/dev.c:5752)
[39977.888793] ? __netif_receive_skb (net/core/dev.c:5747)
[39977.888822] ? br_multicast_set_startup_query_intvl
(net/bridge/br_multicast.c:5014)
[39977.888853] ? br_nf_forward_ip (net/bridge/br_netfilter_hooks.c:647)
[39977.888878] ? nf_hook_slow (net/netfilter/core.c:625)
[39977.888907] ? br_handle_vlan (net/bridge/br_vlan.c:483)
[39977.888936] br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[39977.888967] ? br_netif_receive_skb (net/bridge/br_input.c:34)
[39977.888999] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:64)
[39977.889033] br_handle_frame_finish (net/bridge/br_input.c:216)
[39977.889067] ? br_handle_local_finish (net/bridge/br_input.c:75)
[39977.889097] ? br_cfm_config_fill_info (net/bridge/br_cfm_netlink.c:510)
[39977.889126] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[39977.889154] ? ret_from_fork (arch/x86/entry/entry_64.S:308)
[39977.889178] ? unwind_next_frame (arch/x86/kernel/unwind_orc.c:655
(discriminator 3))
[39977.889209] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[39977.889234] ? write_profile (kernel/stacktrace.c:83)
[39977.889264] br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[39977.889294] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[39977.889323] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[39977.889352] ? __rcu_read_unlock (kernel/rcu/tree_plugin.h:382
kernel/rcu/tree_plugin.h:421)
[39977.889381] ? br_handle_local_finish (net/bridge/br_input.c:75)
[39977.889410] ? packet_rcv (net/packet/af_packet.c:2231)
[39977.889438] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[39977.889469] ? virtio_net_hdr_to_skb.constprop.0 (drivers/net/tun.c:753)
[39977.889498] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[39977.889530] ? udp_lib_lport_inuse (net/ipv4/udp.c:152)
[39977.889561] ? do_xdp_generic (net/core/dev.c:5281)
[39977.889591] ? udp4_lib_lookup2 (net/ipv4/udp.c:449 (discriminator 9))
[39977.889622] ? __udp4_lib_lookup (net/ipv4/udp.c:531)
[39977.889652] __netif_receive_skb_list_core (net/core/dev.c:5570)
[39977.889685] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5546)
[39977.889717] ? udp_push_pending_frames (net/ipv4/udp.c:495)
[39977.889749] ? kmem_cache_alloc_bulk (mm/slub.c:4033)
[39977.889778] ? napi_skb_cache_get (net/core/skbuff.c:338)
[39977.889807] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[39977.889836] netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[39977.889870] ? process_backlog (net/core/dev.c:5699)
[39977.889898] ? udp4_gro_complete (net/ipv4/udp_offload.c:714)
[39977.889927] ? __rcu_read_unlock (kernel/rcu/tree_plugin.h:423)
[39977.889954] ? napi_gro_complete.constprop.0
(./include/net/gro.h:444 net/core/gro.c:328)
[39977.889985] ? napi_gro_flush (./arch/x86/include/asm/bitops.h:94
./include/asm-generic/bitops/instrumented-non-atomic.h:45
net/core/gro.c:346 net/core/gro.c:361)
[39977.890014] napi_complete_done (./include/linux/list.h:37
./include/net/gro.h:434 ./include/net/gro.h:429 net/core/dev.c:6067)
[39977.890044] ? napi_busy_loop (net/core/dev.c:6034)
[39977.890076] ixgbe_poll (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3191)
[39977.890115] ? ixgbe_xdp_ring_update_tail_locked
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3141)
[39977.890145] ? asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[39977.890180] ? asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[39977.890216] __napi_poll (net/core/dev.c:6498)
[39977.890246] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[39977.890278] ? __napi_poll (net/core/dev.c:6625)
[39977.890304] ? migrate_enable (kernel/sched/core.c:3045)
[39977.890337] ? __napi_poll (net/core/dev.c:6625)
[39977.890364] kthread (kernel/kthread.c:379)
[39977.890391] ? kthread_complete_and_exit (kernel/kthread.c:332)
[39977.890422] ret_from_fork (arch/x86/entry/entry_64.S:314)
[39977.890453] </TASK>
[40040.892521] rcu: INFO: rcu_preempt self-detected stall on CPU
[40040.898414] rcu: 2-....: (147006 ticks this GP)
idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=53043
[40040.908831] rcu: (t=147079 jiffies g=18175157 q=464422 ncpus=12)
[40040.915056] CPU: 2 PID: 913 Comm: napi/eno2-84 Tainted: G W
6.4.1-dirty #372
[40040.915084] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[40040.915101] RIP: 0010:__asan_load4 (mm/kasan/generic.c:172
mm/kasan/generic.c:258)
[40040.915134] Code: 00 00 be 02 00 00 00 e9 ba f4 ff ff 40 38 f0 7e
ec c3 cc cc cc cc 48 c1 e8 03 80 3c 10 00 75 ef c3 cc cc cc cc 90 48
8b 34 24 <48> 83 ff fb 77 52 48 b8 ff ff ff ff ff 7f ff ff 48 39 f8 73
43 48
All code
========
0: 00 00 add %al,(%rax)
2: be 02 00 00 00 mov $0x2,%esi
7: e9 ba f4 ff ff jmp 0xfffffffffffff4c6
c: 40 38 f0 cmp %sil,%al
f: 7e ec jle 0xfffffffffffffffd
11: c3 ret
12: cc int3
13: cc int3
14: cc int3
15: cc int3
16: 48 c1 e8 03 shr $0x3,%rax
1a: 80 3c 10 00 cmpb $0x0,(%rax,%rdx,1)
1e: 75 ef jne 0xf
20: c3 ret
21: cc int3
22: cc int3
23: cc int3
24: cc int3
25: 90 nop
26: 48 8b 34 24 mov (%rsp),%rsi
2a:* 48 83 ff fb cmp $0xfffffffffffffffb,%rdi <--
trapping instruction
2e: 77 52 ja 0x82
30: 48 b8 ff ff ff ff ff movabs $0xffff7fffffffffff,%rax
37: 7f ff ff
3a: 48 39 f8 cmp %rdi,%rax
3d: 73 43 jae 0x82
3f: 48 rex.W

Code starting with the faulting instruction
===========================================
0: 48 83 ff fb cmp $0xfffffffffffffffb,%rdi
4: 77 52 ja 0x58
6: 48 b8 ff ff ff ff ff movabs $0xffff7fffffffffff,%rax
d: 7f ff ff
10: 48 39 f8 cmp %rdi,%rax
13: 73 43 jae 0x58
15: 48 rex.W
[40040.915158] RSP: 0018:ffff888109f9e5e0 EFLAGS: 00000202
[40040.915186] RAX: 0000000000000001 RBX: ffff888109f9e600 RCX: ffffffffa1779901
[40040.915205] RDX: ffff888109f9ecd8 RSI: ffffffffa1778cea RDI: ffff888109f9e600
[40040.915225] RBP: ffff888109f9e688 R08: 0000000000000000 R09: ffffffffa5076637
[40040.915244] R10: fffffbfff4a0ecc6 R11: ffff8882794a12c0 R12: ffff888109f9e6b8
[40040.915263] R13: 0000000000000000 R14: ffff8881220a5000 R15: 0000000000000000
[40040.915281] FS: 0000000000000000(0000) GS:ffff8883ef700000(0000)
knlGS:0000000000000000
[40040.915302] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40040.915320] CR2: 00007f21303f298c CR3: 00000002ab8a6000 CR4: 00000000003526e0
[40040.915339] Call Trace:
[40040.915353] <IRQ>
[40040.915370] ? rcu_dump_cpu_stacks (kernel/rcu/tree_stall.h:372)
[40040.915403] ? rcu_sched_clock_irq (kernel/rcu/tree_stall.h:692
kernel/rcu/tree_stall.h:774 kernel/rcu/tree.c:3822
kernel/rcu/tree.c:2214)
[40040.915434] ? resched_curr (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
./include/linux/thread_info.h:118 ./include/linux/sched.h:2050
./include/linux/sched.h:2065 kernel/sched/core.c:1048)
[40040.915462] ? wake_q_add_safe (kernel/sched/core.c:1042)
[40040.915488] ? rcu_note_context_switch (kernel/rcu/tree.c:2193)
[40040.915519] ? clear_buddies (kernel/sched/fair.c:4922)
[40040.915542] ? run_posix_cpu_timers
(./include/linux/sched/deadline.h:15
./include/linux/sched/deadline.h:22
kernel/time/posix-cpu-timers.c:1155
kernel/time/posix-cpu-timers.c:1451)
[40040.915570] ? clear_posix_cputimers_work
(kernel/time/posix-cpu-timers.c:1435)
[40040.915599] ? cpuacct_account_field (./include/linux/cgroup.h:437
kernel/sched/cpuacct.c:39 kernel/sched/cpuacct.c:354)
[40040.915627] ? hrtimer_run_queues (kernel/time/hrtimer.c:1900)
[40040.915656] ? update_process_times
(./arch/x86/include/asm/preempt.h:27 kernel/time/timer.c:2073)
[40040.915682] ? tick_sched_handle (kernel/time/tick-sched.c:255)
[40040.915711] ? tick_sched_timer (kernel/time/tick-sched.c:1497)
[40040.915739] ? tick_sched_do_timer (kernel/time/tick-sched.c:1479)
[40040.915769] ? __hrtimer_run_queues (kernel/time/hrtimer.c:1685
kernel/time/hrtimer.c:1749)
[40040.915797] ? enqueue_hrtimer (kernel/time/hrtimer.c:1719)
[40040.915823] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161)
[40040.915855] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[40040.915882] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[40040.915908] ? ktime_get_update_offsets_now
(kernel/time/timekeeping.c:2323 (discriminator 3))
[40040.915939] ? hrtimer_interrupt (kernel/time/hrtimer.c:1814)
[40040.915970] ? __sysvec_apic_timer_interrupt
(./arch/x86/include/asm/jump_label.h:27
./include/linux/jump_label.h:207
./arch/x86/include/asm/trace/irq_vectors.h:41
arch/x86/kernel/apic/apic.c:1113)
[40040.915999] ? sysvec_apic_timer_interrupt
(arch/x86/kernel/apic/apic.c:1106 (discriminator 14))
[40040.916025] </IRQ>
[40040.916038] <TASK>
[40040.916051] ? asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[40040.916087] ? unwind_next_frame (arch/x86/kernel/unwind_orc.c:628)
[40040.916117] ? unwind_get_return_address (arch/x86/kernel/unwind_orc.c:341)
[40040.916149] ? __asan_load4 (mm/kasan/generic.c:172 mm/kasan/generic.c:258)
[40040.916173] unwind_get_return_address (arch/x86/kernel/unwind_orc.c:341)
[40040.916204] ? write_profile (kernel/stacktrace.c:83)
[40040.916233] arch_stack_walk (arch/x86/kernel/stacktrace.c:26)
[40040.916260] ? __udp_gso_segment (net/ipv4/udp_offload.c:290)
[40040.916294] stack_trace_save (kernel/stacktrace.c:123)
[40040.916323] ? filter_irq_stacks (kernel/stacktrace.c:114)
[40040.916353] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[40040.916378] kasan_save_stack (mm/kasan/common.c:46)
[40040.916408] ? kasan_save_stack (mm/kasan/common.c:46)
[40040.916436] ? kasan_set_track (mm/kasan/common.c:52)
[40040.916463] ? __kasan_slab_alloc (mm/kasan/common.c:328)
[40040.916491] ? kmem_cache_alloc (mm/slab.h:711 mm/slub.c:3451
mm/slub.c:3459 mm/slub.c:3466 mm/slub.c:3475)
[40040.916518] ? __create_object (mm/kmemleak.c:458 mm/kmemleak.c:635)
[40040.916542] ? kmem_cache_alloc_node (./include/linux/kmemleak.h:42
mm/slab.h:714 mm/slub.c:3451 mm/slub.c:3496)
[40040.916569] ? __alloc_skb (net/core/skbuff.c:644)
[40040.916597] ? skb_segment (net/core/skbuff.c:4519)
[40040.916621] ? __stack_depot_save (lib/stackdepot.c:379)
[40040.916653] ? kasan_save_stack (mm/kasan/common.c:47)
[40040.916681] ? kasan_save_stack (mm/kasan/common.c:46)
[40040.916709] ? kasan_set_track (mm/kasan/common.c:52)
[40040.916736] ? __kasan_slab_alloc (mm/kasan/common.c:328)
[40040.916765] ? kmem_cache_alloc_node (mm/slab.h:711 mm/slub.c:3451
mm/slub.c:3496)
[40040.916793] ? __alloc_skb (net/core/skbuff.c:644)
[40040.916820] ? skb_segment (net/core/skbuff.c:4519)
[40040.916843] ? __udp_gso_segment (net/ipv4/udp_offload.c:290)
[40040.916871] ? udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[40040.916895] ? ipv6_gso_segment (net/ipv6/ip6_offload.c:119
net/ipv6/ip6_offload.c:74)
[40040.916920] ? skb_mac_gso_segment (net/core/gro.c:141)
[40040.916948] ? _raw_spin_lock_irqsave
(./arch/x86/include/asm/atomic.h:202
./include/linux/atomic/atomic-instrumented.h:543
./include/asm-generic/qspinlock.h:111 ./include/linux/spinlock.h:186
./include/linux/spinlock_api_smp.h:111 kernel/locking/spinlock.c:162)
[40040.916978] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161)
[40040.917008] ? ipv6_rcv (./include/net/dst.h:468
net/ipv6/ip6_input.c:79 ./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:309)
[40040.917035] ? netif_receive_skb (net/core/dev.c:5693 net/core/dev.c:5752)
[40040.917062] ? br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[40040.917090] ? br_handle_frame_finish (net/bridge/br_input.c:216)
[40040.917119] ? br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[40040.917147] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[40040.917177] ? __netif_receive_skb_list_core (net/core/dev.c:5570)
[40040.917206] ? _raw_spin_unlock_irqrestore
(./include/asm-generic/qspinlock.h:128 ./include/linux/spinlock.h:203
./include/linux/spinlock_api_smp.h:150 kernel/locking/spinlock.c:194)
[40040.917237] ? __create_object (mm/kmemleak.c:458 mm/kmemleak.c:635)
[40040.917262] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161)
[40040.917294] ? _raw_read_unlock_irqrestore (kernel/locking/spinlock.c:161)
[40040.917324] kasan_set_track (mm/kasan/common.c:52)
[40040.917353] __kasan_slab_alloc (mm/kasan/common.c:328)
[40040.917385] kmem_cache_alloc (mm/slab.h:711 mm/slub.c:3451
mm/slub.c:3459 mm/slub.c:3466 mm/slub.c:3475)
[40040.917417] __create_object (mm/kmemleak.c:458 mm/kmemleak.c:635)
[40040.917442] ? kasan_set_track (mm/kasan/common.c:52)
[40040.917472] kmem_cache_alloc_node (./include/linux/kmemleak.h:42
mm/slab.h:714 mm/slub.c:3451 mm/slub.c:3496)
[40040.917505] __alloc_skb (net/core/skbuff.c:644)
[40040.917535] ? __napi_build_skb (net/core/skbuff.c:627)
[40040.917566] ? skb_segment (./include/linux/skbuff.h:2791
net/core/skbuff.c:4539)
[40040.917593] skb_segment (net/core/skbuff.c:4519)
[40040.917619] ? write_profile (kernel/stacktrace.c:83)
[40040.917657] ? pskb_extract (net/core/skbuff.c:4360)
[40040.917680] ? rt6_score_route (net/ipv6/route.c:713 (discriminator 1))
[40040.917706] ? llist_add_batch (lib/llist.c:33 (discriminator 14))
[40040.917737] __udp_gso_segment (net/ipv4/udp_offload.c:290)
[40040.917769] ? ip6_dst_destroy (net/ipv6/route.c:788)
[40040.917798] udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
[40040.917826] ? udp6_gro_complete (net/ipv6/udp_offload.c:20)
[40040.917852] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:53)
[40040.917881] ipv6_gso_segment (net/ipv6/ip6_offload.c:119
net/ipv6/ip6_offload.c:74)
[40040.917909] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:76)
[40040.917935] ? nft_update_chain_stats (net/netfilter/nf_tables_core.c:254)
[40040.917963] ? fib6_select_path (net/ipv6/route.c:458)
[40040.917993] skb_mac_gso_segment (net/core/gro.c:141)
[40040.918021] ? skb_eth_gso_segment (net/core/gro.c:127)
[40040.918049] ? ipv6_skip_exthdr (net/ipv6/exthdrs_core.c:190)
[40040.918073] ? kasan_save_stack (mm/kasan/common.c:47)
[40040.918104] __skb_gso_segment (net/core/dev.c:3401 (discriminator 2))
[40040.918132] udpv6_queue_rcv_skb (./include/net/udp.h:492
net/ipv6/udp.c:796 net/ipv6/udp.c:787)
[40040.918161] __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)
[40040.918191] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:437
(discriminator 4))
[40040.918227] ip6_input_finish (./include/linux/rcupdate.h:805
net/ipv6/ip6_input.c:483)
[40040.918256] ip6_input (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:491)
[40040.918283] ? ip6_input_finish (net/ipv6/ip6_input.c:490)
[40040.918312] ? ip6_route_del (net/ipv6/route.c:4013)
[40040.918338] ? ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:479)
[40040.918368] ? __rcu_read_unlock (kernel/rcu/tree_plugin.h:382
kernel/rcu/tree_plugin.h:421)
[40040.918394] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[40040.918424] ip6_mc_input (net/ipv6/ip6_input.c:591)
[40040.918454] ? ip6_rcv_finish (net/ipv6/ip6_input.c:498)
[40040.918487] ipv6_rcv (./include/net/dst.h:468
net/ipv6/ip6_input.c:79 ./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:309)
[40040.918515] ? ip6_input (net/ipv6/ip6_input.c:303)
[40040.918541] ? stack_trace_save (kernel/stacktrace.c:123)
[40040.918571] ? ipv6_list_rcv (net/ipv6/ip6_input.c:70)
[40040.918602] ? ip6_input (net/ipv6/ip6_input.c:303)
[40040.918627] __netif_receive_skb_one_core (net/core/dev.c:5486)
[40040.918658] ? __netif_receive_skb_list_core (net/core/dev.c:5486)
[40040.918688] ? br_nf_dev_queue_xmit (net/bridge/br_netfilter_hooks.c:820)
[40040.918717] ? br_forward_finish (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/bridge/br_forward.c:66)
[40040.918746] netif_receive_skb (net/core/dev.c:5693 net/core/dev.c:5752)
[40040.918775] ? __netif_receive_skb (net/core/dev.c:5747)
[40040.918803] ? br_multicast_set_startup_query_intvl
(net/bridge/br_multicast.c:5014)
[40040.918834] ? br_nf_forward_ip (net/bridge/br_netfilter_hooks.c:647)
[40040.918859] ? nf_hook_slow (net/netfilter/core.c:625)
[40040.918887] ? br_handle_vlan (net/bridge/br_vlan.c:483)
[40040.918916] br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[40040.918946] ? br_netif_receive_skb (net/bridge/br_input.c:34)
[40040.918978] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:64)
[40040.919011] br_handle_frame_finish (net/bridge/br_input.c:216)
[40040.919044] ? br_handle_local_finish (net/bridge/br_input.c:75)
[40040.919074] ? br_cfm_config_fill_info (net/bridge/br_cfm_netlink.c:510)
[40040.919102] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[40040.919130] ? ret_from_fork (arch/x86/entry/entry_64.S:308)
[40040.919153] ? unwind_next_frame (arch/x86/kernel/unwind_orc.c:655
(discriminator 3))
[40040.919183] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[40040.919208] ? write_profile (kernel/stacktrace.c:83)
[40040.919238] br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[40040.919266] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[40040.919295] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[40040.919324] ? __rcu_read_unlock (kernel/rcu/tree_plugin.h:382
kernel/rcu/tree_plugin.h:421)
[40040.919352] ? br_handle_local_finish (net/bridge/br_input.c:75)
[40040.919381] ? packet_rcv (net/packet/af_packet.c:2231)
[40040.919408] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[40040.919438] ? virtio_net_hdr_to_skb.constprop.0 (drivers/net/tun.c:753)
[40040.919467] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[40040.919498] ? udp_lib_lport_inuse (net/ipv4/udp.c:152)
[40040.919528] ? do_xdp_generic (net/core/dev.c:5281)
[40040.919557] ? udp4_lib_lookup2 (net/ipv4/udp.c:449 (discriminator 9))
[40040.919588] ? __udp4_lib_lookup (net/ipv4/udp.c:531)
[40040.919618] __netif_receive_skb_list_core (net/core/dev.c:5570)
[40040.919651] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5546)
[40040.919682] ? udp_push_pending_frames (net/ipv4/udp.c:495)
[40040.919714] ? kmem_cache_alloc_bulk (mm/slub.c:4033)
[40040.919743] ? napi_skb_cache_get (net/core/skbuff.c:338)
[40040.919772] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[40040.919801] netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[40040.919834] ? process_backlog (net/core/dev.c:5699)
[40040.919861] ? udp4_gro_complete (net/ipv4/udp_offload.c:714)
[40040.919890] ? __rcu_read_unlock (kernel/rcu/tree_plugin.h:423)
[40040.919917] ? napi_gro_complete.constprop.0
(./include/net/gro.h:444 net/core/gro.c:328)
[40040.919948] ? napi_gro_flush (./arch/x86/include/asm/bitops.h:94
./include/asm-generic/bitops/instrumented-non-atomic.h:45
net/core/gro.c:346 net/core/gro.c:361)
[40040.919977] napi_complete_done (./include/linux/list.h:37
./include/net/gro.h:434 ./include/net/gro.h:429 net/core/dev.c:6067)
[40040.920007] ? napi_busy_loop (net/core/dev.c:6034)
[40040.920039] ixgbe_poll (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3191)
[40040.920077] ? ixgbe_xdp_ring_update_tail_locked
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3141)
[40040.920107] ? asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[40040.920140] ? asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[40040.920177] __napi_poll (net/core/dev.c:6498)
[40040.920206] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[40040.920238] ? __napi_poll (net/core/dev.c:6625)
[40040.920264] ? migrate_enable (kernel/sched/core.c:3045)
[40040.920296] ? __napi_poll (net/core/dev.c:6625)
[40040.920323] kthread (kernel/kthread.c:379)
[40040.920349] ? kthread_complete_and_exit (kernel/kthread.c:332)
[40040.920380] ret_from_fork (arch/x86/entry/entry_64.S:314)
[40040.920410] </TASK>
[40065.078922] ------------[ cut here ]------------
[40065.078949] NETDEV WATCHDOG: eno2 (ixgbe): transmit queue 7 timed out 7960 ms
[40065.079126] WARNING: CPU: 8 PID: 0 at net/sched/sch_generic.c:525
dev_watchdog (net/sched/sch_generic.c:525 (discriminator 3))
[40065.079165] Modules linked in: chaoskey
[40065.079200] CPU: 8 PID: 0 Comm: swapper/8 Tainted: G W
6.4.1-dirty #372
[40065.079227] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[40065.079244] RIP: 0010:dev_watchdog (net/sched/sch_generic.c:525
(discriminator 3))
[40065.079271] Code: 8b 3c 24 c6 05 07 e2 ad 01 01 4c 89 ff e8 59 60
f6 ff 45 89 f0 44 89 e9 4c 89 fe 48 89 c2 48 c7 c7 40 26 c1 a3 e8 71
97 a5 fe <0f> 0b e9 f5 fe ff ff 66 2e 0f 1f 84 00 00 00 00 00 41 57 41
56 41
All code
========
0: 8b 3c 24 mov (%rsp),%edi
3: c6 05 07 e2 ad 01 01 movb $0x1,0x1ade207(%rip) # 0x1ade211
a: 4c 89 ff mov %r15,%rdi
d: e8 59 60 f6 ff call 0xfffffffffff6606b
12: 45 89 f0 mov %r14d,%r8d
15: 44 89 e9 mov %r13d,%ecx
18: 4c 89 fe mov %r15,%rsi
1b: 48 89 c2 mov %rax,%rdx
1e: 48 c7 c7 40 26 c1 a3 mov $0xffffffffa3c12640,%rdi
25: e8 71 97 a5 fe call 0xfffffffffea5979b
2a:* 0f 0b ud2 <-- trapping instruction
2c: e9 f5 fe ff ff jmp 0xffffffffffffff26
31: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
38: 00 00 00
3b: 41 57 push %r15
3d: 41 56 push %r14
3f: 41 rex.B

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e9 f5 fe ff ff jmp 0xfffffffffffffefc
7: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
e: 00 00 00
11: 41 57 push %r15
13: 41 56 push %r14
15: 41 rex.B
[40065.079296] RSP: 0018:ffff8883efa09dc0 EFLAGS: 00010286
[40065.079322] RAX: 0000000000000000 RBX: ffff88810a0c8400 RCX: 0000000000000027
[40065.079341] RDX: 0000000000000027 RSI: ffffffffa1a0d41e RDI: ffff8883efa273c8
[40065.079361] RBP: ffff88810a0c83dc R08: 0000000000000001 R09: ffff8883efa273cb
[40065.079379] R10: ffffed107df44e79 R11: 0000000000000001 R12: 00000001025ea4e9
[40065.079398] R13: 0000000000000007 R14: 0000000000001f18 R15: ffff88810a0c8000
[40065.079416] FS: 0000000000000000(0000) GS:ffff8883efa00000(0000)
knlGS:0000000000000000
[40065.079437] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[40065.079455] CR2: 00007f21286b0ec8 CR3: 0000000105000000 CR4: 00000000003526e0
[40065.079474] Call Trace:
[40065.079490] <IRQ>
[40065.079505] ? __warn (kernel/panic.c:673)
[40065.079533] ? dev_watchdog (net/sched/sch_generic.c:525 (discriminator 3))
[40065.079558] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[40065.079588] ? handle_bug (arch/x86/kernel/traps.c:324)
[40065.079616] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1))
[40065.079643] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[40065.079678] ? irq_work_claim (./arch/x86/include/asm/atomic.h:29
./arch/x86/include/asm/atomic.h:240
./include/linux/atomic/atomic-instrumented.h:407 kernel/irq_work.c:61)
[40065.079706] ? dev_watchdog (net/sched/sch_generic.c:525 (discriminator 3))
[40065.079733] ? qdisc_free_cb (net/sched/sch_generic.c:496)
[40065.079757] ? qdisc_free_cb (net/sched/sch_generic.c:496)
[40065.079780] call_timer_fn (./arch/x86/include/asm/jump_label.h:27
./include/linux/jump_label.h:207 ./include/trace/events/timer.h:127
kernel/time/timer.c:1701)
[40065.079807] ? qdisc_free_cb (net/sched/sch_generic.c:496)
[40065.079831] __run_timers.part.0 (kernel/time/timer.c:1752
kernel/time/timer.c:2022)
[40065.079860] ? call_timer_fn (kernel/time/timer.c:1995)
[40065.079884] ? rebalance_domains (kernel/sched/fair.c:11239)
[40065.079946] ? __hrtimer_next_event_base (kernel/time/hrtimer.c:525)
[40065.079980] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[40065.080008] ? ktime_get (kernel/time/timekeeping.c:292
(discriminator 3) kernel/time/timekeeping.c:388 (discriminator 3)
kernel/time/timekeeping.c:848 (discriminator 3))
[40065.080034] ? __sysvec_spurious_apic_interrupt
(./arch/x86/include/asm/barrier.h:99 arch/x86/kernel/apic/apic.c:488)
[40065.080067] run_timer_softirq (kernel/time/timer.c:2000
kernel/time/timer.c:2037)
[40065.080094] __do_softirq (./arch/x86/include/asm/jump_label.h:27
./include/linux/jump_label.h:207 ./include/trace/events/irq.h:142
kernel/softirq.c:572)
[40065.080123] irq_exit_rcu (kernel/softirq.c:445 kernel/softirq.c:650
kernel/softirq.c:662)
[40065.080153] sysvec_apic_timer_interrupt
(arch/x86/kernel/apic/apic.c:1106 (discriminator 14))
[40065.080181] </IRQ>
[40065.080195] <TASK>
[40065.080208] asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[40065.080242] RIP: 0010:cpuidle_enter_state (drivers/cpuidle/cpuidle.c:291)
[40065.080268] Code: 48 83 3c 03 00 0f 84 28 01 00 00 83 ea 01 73 e4
48 83 c4 08 44 89 e0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc fb
45 85 e4 <0f> 89 6b ff ff ff 4b 8d 44 6d 00 48 c7 43 18 00 00 00 00 48
c1 e0
All code
========
0: 48 83 3c 03 00 cmpq $0x0,(%rbx,%rax,1)
5: 0f 84 28 01 00 00 je 0x133
b: 83 ea 01 sub $0x1,%edx
e: 73 e4 jae 0xfffffffffffffff4
10: 48 83 c4 08 add $0x8,%rsp
14: 44 89 e0 mov %r12d,%eax
17: 5b pop %rbx
18: 5d pop %rbp
19: 41 5c pop %r12
1b: 41 5d pop %r13
1d: 41 5e pop %r14
1f: 41 5f pop %r15
21: c3 ret
22: cc int3
23: cc int3
24: cc int3
25: cc int3
26: fb sti
27: 45 85 e4 test %r12d,%r12d
2a:* 0f 89 6b ff ff ff jns 0xffffffffffffff9b <-- trapping instruction
30: 4b 8d 44 6d 00 lea 0x0(%r13,%r13,2),%rax
35: 48 c7 43 18 00 00 00 movq $0x0,0x18(%rbx)
3c: 00
3d: 48 rex.W
3e: c1 .byte 0xc1
3f: e0 .byte 0xe0

Code starting with the faulting instruction
===========================================
0: 0f 89 6b ff ff ff jns 0xffffffffffffff71
6: 4b 8d 44 6d 00 lea 0x0(%r13,%r13,2),%rax
b: 48 c7 43 18 00 00 00 movq $0x0,0x18(%rbx)
12: 00
13: 48 rex.W
14: c1 .byte 0xc1
15: e0 .byte 0xe0
[40065.080292] RSP: 0018:ffff888100bffe18 EFLAGS: 00000202
[40065.080318] RAX: 0000000000000000 RBX: ffffe8ffffc00000 RCX: ffffffffa18f0680
[40065.080337] RDX: dffffc0000000000 RSI: 0000000000000008 RDI: ffff8883efa38928
[40065.080356] RBP: ffffffffa44d78e0 R08: 0000000000000000 R09: 0000000040000000
[40065.080374] R10: ffffed107df46834 R11: 0000000000000000 R12: 0000000000000002
[40065.080392] R13: 0000000000000002 R14: 0000247060cbd11a R15: 0000000000000000
[40065.080414] ? sched_idle_set_state (kernel/sched/sched.h:2341
kernel/sched/idle.c:19)
[40065.080450] ? cpuidle_enter_state (drivers/cpuidle/cpuidle.c:288)
[40065.080477] cpuidle_enter (drivers/cpuidle/cpuidle.c:390)
[40065.080506] do_idle (kernel/sched/idle.c:219 kernel/sched/idle.c:282)
[40065.080535] ? arch_cpu_idle_exit+0x30/0x30
[40065.080564] ? schedule_idle (./arch/x86/include/asm/bitops.h:207
(discriminator 1) ./arch/x86/include/asm/bitops.h:239 (discriminator
1) ./include/linux/thread_info.h:184 (discriminator 1)
./include/linux/sched.h:2244 (discriminator 1)
kernel/sched/core.c:6774 (discriminator 1))
[40065.080590] ? finish_task_switch.isra.0 (kernel/sched/core.c:4965
kernel/sched/core.c:5211)
[40065.080620] cpu_startup_entry (kernel/sched/idle.c:378 (discriminator 1))
[40065.080649] start_secondary (arch/x86/kernel/smpboot.c:288)
[40065.080677] secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:370)
[40065.080710] </TASK>
[40065.080724] ---[ end trace 0000000000000000 ]---
[40065.080765] ixgbe 0000:06:00.1 eno2: initiating reset due to tx timeout
[40065.080842] ixgbe 0000:06:00.1 eno2: Reset adapter
[40065.085843] ixgbe 0000:06:00.1 eno2: NIC Link is Down

and then the machine rebooted...

On Thu, Jun 29, 2023 at 12:50 PM Ian Kumlien <[email protected]> wrote:
>
> On Wed, Jun 28, 2023 at 10:18 PM Ian Kumlien <[email protected]> wrote:
> >
> > On Wed, Jun 28, 2023 at 5:15 PM Paolo Abeni <[email protected]> wrote:
> > >
> > > On Wed, 2023-06-28 at 14:04 +0200, Ian Kumlien wrote:
> > > > So have some hits, would it be better without your warn on? ... Things
> > > > are a bit slow atm - lets just say that i noticed the stacktraces
> > > > because a stream stuttered =)
> > >
> > > Sorry, I screwed-up completely a newly added check.
> >
> > Thats ok
> >
> > > If you have Kasan enabled you can simply and more safely remove my 2nd
> > > patch. Kasan should be able to catch all the out-of-buffer scenarios
> > > such checks were intended to prevent.
> >
> > I thought I'd run without any of the patches, preparing for that now,
> > but i have to stop testing tomorrow and will continue on monday if i
> > don't catch anything
>
> So, KASAN caught the null pointer derefs, as expected, but it caught
> two of them which i didn't expect.
>
> Anyway, I'm off for the weekend so, I hope to be able to send
> something better on Monday, fyi
>
> > > Cheers,
> > >
> > > Paolo
> > >

2023-07-04 10:26:59

by Paolo Abeni

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Mon, 2023-07-03 at 11:37 +0200, Ian Kumlien wrote:
> So, got back, switched to 6.4.1 and reran with kmemleak and kasan
>
> I got the splat from:
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index cea28d30abb5..701c1b5cf532 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4328,6 +4328,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>
> skb->prev = tail;
>
> + if (WARN_ON_ONCE(!skb->next))
> + goto err_linearize;
> +
> if (skb_needs_linearize(skb, features) &&
> __skb_linearize(skb))
> goto err_linearize;
>
> I'm just happy i ran with dmesg -W since there was only minimal output
> on the console:
> [39914.833696] rcu: INFO: rcu_preempt self-detected stall on CPU
> [39914.839598] rcu: 2-....: (20997 ticks this GP)
> idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=4687
> [39914.849839] rcu: (t=21017 jiffies g=18175157 q=45473 ncpus=12)
> [39977.862108] rcu: INFO: rcu_preempt self-detected stall on CPU
> [39977.868002] rcu: 2-....: (84001 ticks this GP)
> idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=28434
> [39977.878340] rcu: (t=84047 jiffies g=18175157 q=263477 ncpus=12)
> [40040.892521] rcu: INFO: rcu_preempt self-detected stall on CPU
> [40040.898414] rcu: 2-....: (147006 ticks this GP)
> idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=53043
> [40040.908831] rcu: (t=147079 jiffies g=18175157 q=464422 ncpus=12)
> [40065.080842] ixgbe 0000:06:00.1 eno2: Reset adapter

Ouch, just another slightly different issue, apparently :(

I'll try some wild guesses. The RCU stall could cause the OOM observed
in the previous tests. Here the OOM did not trigger because, with
kasan/kmemleak enabled, the kernel can process fewer packets in the
same period of time.

[...]
> [39914.857231] skb_segment (net/core/skbuff.c:4519)

I *think* this could be looping "forever", if gso_size becomes 0, which
is in turn completely unexpected ...
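
A minimal user-space sketch of why a zero step size can spin forever. This is a simplified model under stated assumptions, not the kernel's skb_segment(): the function name model_count_segments and the max_iters safety cap are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/*
 * Hypothetical model, not kernel code: walk a payload of `len` bytes in
 * steps of at most `gso_size` bytes and count the segments produced.
 * `max_iters` is a safety cap added for this sketch only.
 * Returns the segment count, or -1 once the cap is reached.
 */
long model_count_segments(size_t len, size_t gso_size, long max_iters)
{
    size_t offset = 0;
    long segs = 0;

    while (offset < len) {
        size_t chunk = len - offset;

        if (chunk > gso_size)
            chunk = gso_size;   /* gso_size == 0 makes chunk 0 ... */
        offset += chunk;        /* ... so offset never advances */

        if (++segs >= max_iters)
            return -1;          /* without the cap this would not end */
    }
    return segs;
}
```

With a 3000-byte payload and a step of 1448 the model returns after three iterations; with a step of 0 it only stops because of the artificial cap.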


> [39914.857257] ? write_profile (kernel/stacktrace.c:83)
> [39914.857296] ? pskb_extract (net/core/skbuff.c:4360)
> [39914.857320] ? rt6_score_route (net/ipv6/route.c:713 (discriminator 1))
> [39914.857346] ? llist_add_batch (lib/llist.c:33 (discriminator 14))
> [39914.857379] __udp_gso_segment (net/ipv4/udp_offload.c:290)
> [39914.857413] ? ip6_dst_destroy (net/ipv6/route.c:788)
> [39914.857442] udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
> [39914.857472] ? udp6_gro_complete (net/ipv6/udp_offload.c:20)
> [39914.857498] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:53)
> [39914.857528] ipv6_gso_segment (net/ipv6/ip6_offload.c:119
> net/ipv6/ip6_offload.c:74)
> [39914.857557] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:76)
> [39914.857583] ? nft_update_chain_stats (net/netfilter/nf_tables_core.c:254)
> [39914.857612] ? fib6_select_path (net/ipv6/route.c:458)
> [39914.857643] skb_mac_gso_segment (net/core/gro.c:141)
> [39914.857673] ? skb_eth_gso_segment (net/core/gro.c:127)
> [39914.857702] ? ipv6_skip_exthdr (net/ipv6/exthdrs_core.c:190)
> [39914.857726] ? kasan_save_stack (mm/kasan/common.c:47)
> [39914.857758] __skb_gso_segment (net/core/dev.c:3401 (discriminator 2))
> [39914.857787] udpv6_queue_rcv_skb (./include/net/udp.h:492
> net/ipv6/udp.c:796 net/ipv6/udp.c:787)
> [39914.857816] __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)

... but this means we are processing a multicast packet, so the skb is
likely cloned. If one of the clone instances simultaneously enters
skb_segment_list(), the latter would unconditionally call:

skb_gso_reset(skb);

clearing the gso area in the shared info and causing unexpected results
(possibly the memory corruption observed before, and the above RCU
stall) for the other clone instances.

Assuming there are no other issues and that the above is not just a
side effect of ENOCOFFEE here, the following should possibly solve it.
Could you please add it to your testbed? (still with kasan+previous
patch, kmemleak is possibly not needed).

Thanks!
---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6c5915efbc17..ac1ca6c7bff9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4263,6 +4263,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

skb_shinfo(skb)->frag_list = NULL;

+ /* later code will clear the gso area in the shared info */
+ err = skb_header_unclone(skb, GFP_ATOMIC);
+ if (err)
+ goto err_linearize;
+
while (list_skb) {
nskb = list_skb;
list_skb = list_skb->next;
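
The hazard described above, and the copy-before-write idea behind the patch, can be sketched in user space. This is a rough model under stated assumptions, not the kernel implementation: every model_* name and the struct layout are hypothetical, and the real skb_header_unclone() does considerably more than this.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the shared info block that clones share. */
struct model_shinfo {
    int dataref;              /* number of buffers sharing this block */
    unsigned int gso_size;
    unsigned int gso_segs;
};

struct model_skb {
    struct model_shinfo *shinfo;
};

/* Create a buffer that owns a fresh shared block. Returns 0 or -1. */
int model_init(struct model_skb *skb, unsigned int gso_size,
               unsigned int gso_segs)
{
    skb->shinfo = malloc(sizeof(*skb->shinfo));
    if (!skb->shinfo)
        return -1;
    skb->shinfo->dataref = 1;
    skb->shinfo->gso_size = gso_size;
    skb->shinfo->gso_segs = gso_segs;
    return 0;
}

/* Clone: the new buffer points at the SAME shared block. */
void model_clone(struct model_skb *dst, const struct model_skb *src)
{
    dst->shinfo = src->shinfo;
    dst->shinfo->dataref++;
}

/* Clear the GSO fields; every buffer sharing the block sees this. */
void model_gso_reset(struct model_skb *skb)
{
    skb->shinfo->gso_size = 0;
    skb->shinfo->gso_segs = 0;
}

/* Give skb a private copy of the block if it is currently shared. */
int model_unclone(struct model_skb *skb)
{
    struct model_shinfo *priv;

    if (skb->shinfo->dataref == 1)
        return 0;             /* already private, nothing to do */

    priv = malloc(sizeof(*priv));
    if (!priv)
        return -1;
    memcpy(priv, skb->shinfo, sizeof(*priv));
    priv->dataref = 1;
    skb->shinfo->dataref--;   /* drop our reference to the shared block */
    skb->shinfo = priv;
    return 0;
}

/* Drop this buffer's reference and free the block on the last one. */
void model_free(struct model_skb *skb)
{
    if (--skb->shinfo->dataref == 0)
        free(skb->shinfo);
    skb->shinfo = NULL;
}
```

In this model, calling model_gso_reset() on one clone zeroes gso_size for the other clone as well, while calling model_unclone() first leaves the other clone's fields intact.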


2023-07-04 12:03:17

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

Proper bug this time:
cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
[ 226.265742] ------------[ cut here ]------------
[ 226.265767] kernel BUG at include/linux/skbuff.h:2645!
[ 226.271087] invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 226.276962] CPU: 2 PID: 948 Comm: napi/eno2-81 Tainted: G W
6.4.1-dirty #374
[ 226.285520] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[ 226.293717] RIP: 0010:udpv6_queue_rcv_skb
(./include/linux/skbuff.h:2645 net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 226.298901] Code: 04 00 00 b9 01 00 00 00 48 89 de e8 bb ac f7 ff 4d
85 f6 75 97 48 83 c4 08 31 c0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc
cc cc <0f> 0b 0f 0b e9 b1 fe ff ff e8 f3 38 41 fd e9 63 fb ff ff 4c 89
e7
All code
========
0: 04 00 add $0x0,%al
2: 00 b9 01 00 00 00 add %bh,0x1(%rcx)
8: 48 89 de mov %rbx,%rsi
b: e8 bb ac f7 ff call 0xfffffffffff7accb
10: 4d 85 f6 test %r14,%r14
13: 75 97 jne 0xffffffffffffffac
15: 48 83 c4 08 add $0x8,%rsp
19: 31 c0 xor %eax,%eax
1b: 5b pop %rbx
1c: 5d pop %rbp
1d: 41 5c pop %r12
1f: 41 5d pop %r13
21: 41 5e pop %r14
23: 41 5f pop %r15
25: c3 ret
26: cc int3
27: cc int3
28: cc int3
29: cc int3
2a:* 0f 0b ud2 <-- trapping instruction
2c: 0f 0b ud2
2e: e9 b1 fe ff ff jmp 0xfffffffffffffee4
33: e8 f3 38 41 fd call 0xfffffffffd41392b
38: e9 63 fb ff ff jmp 0xfffffffffffffba0
3d: 4c 89 e7 mov %r12,%rdi

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 0f 0b ud2
4: e9 b1 fe ff ff jmp 0xfffffffffffffeba
9: e8 f3 38 41 fd call 0xfffffffffd413901
e: e9 63 fb ff ff jmp 0xfffffffffffffb76
13: 4c 89 e7 mov %r12,%rdi
[ 226.317801] RSP: 0018:ffff888107b8ee78 EFLAGS: 00010207
[ 226.323151] RAX: 0000000000000007 RBX: ffff888109dfe8c0 RCX: 0000000000000000
[ 226.330403] RDX: ffff8881227200c0 RSI: 0000000000000004 RDI: ffff888109dfe934
[ 226.337646] RBP: 0000000000000036 R08: 0000000000000001 R09: ffff888109dfe997
[ 226.344892] R10: ffffed10213bfd32 R11: 00c000ce00f6dd86 R12: dffffc0000000000
[ 226.352137] R13: ffff8881186ae780 R14: ffff888109dff2c0 R15: 0000000000000025
[ 226.359383] FS: 0000000000000000(0000) GS:ffff8883ef300000(0000)
knlGS:0000000000000000
[ 226.367590] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 226.373449] CR2: 00007f40938086d0 CR3: 00000001197c0000 CR4: 00000000003526e0
[ 226.380705] Call Trace:
[ 226.383282] <TASK>
[ 226.385507] ? die (arch/x86/kernel/dumpstack.c:421
arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447)
[ 226.388521] ? do_trap (arch/x86/kernel/traps.c:124
arch/x86/kernel/traps.c:165)
[ 226.392056] ? udpv6_queue_rcv_skb (./include/linux/skbuff.h:2645
net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 226.396635] ? udpv6_queue_rcv_skb (./include/linux/skbuff.h:2645
net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 226.401211] ? handle_invalid_op (arch/x86/kernel/traps.c:88
arch/x86/kernel/traps.c:186 arch/x86/kernel/traps.c:297)
[ 226.405438] ? udpv6_queue_rcv_skb (./include/linux/skbuff.h:2645
net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 226.410016] ? exc_invalid_op (arch/x86/kernel/traps.c:352)
[ 226.413983] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[ 226.418305] ? udpv6_queue_rcv_skb (./include/linux/skbuff.h:2645
net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 226.422886] ? udpv6_queue_rcv_skb (net/ipv6/udp.c:797 net/ipv6/udp.c:787)
[ 226.427461] __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)
[ 226.431606] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[ 226.436186] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:439
(discriminator 4))
[ 226.441116] ip6_input_finish (./include/linux/rcupdate.h:805
net/ipv6/ip6_input.c:483)
[ 226.445174] ip6_input (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:491)
[ 226.448623] ? ip6_input_finish (net/ipv6/ip6_input.c:490)
[ 226.452939] ? ip6_route_del (net/ipv6/route.c:3949 net/ipv6/route.c:4090)
[ 226.457082] ? ip6_protocol_deliver_rcu
(./include/linux/skbuff.h:4180 net/ipv6/ip6_input.c:480)
[ 226.462266] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[ 226.466840] ip6_mc_input (net/ipv6/ip6_input.c:591)
[ 226.470639] ? ip6_rcv_finish (net/ipv6/ip6_input.c:498)
[ 226.474783] ipv6_rcv (./include/net/dst.h:468
net/ipv6/ip6_input.c:79 ./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:309)
[ 226.478229] ? ip6_input (net/ipv6/ip6_input.c:303)
[ 226.481937] ? kthread (kernel/kthread.c:379)
[ 226.485473] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 226.489348] ? ipv6_mc_validate_checksum (net/ipv6/mcast_snoop.c:173)
[ 226.494443] ? ipv6_list_rcv (./include/net/l3mdev.h:169
./include/net/l3mdev.h:190 net/ipv6/ip6_input.c:74)
[ 226.498500] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:55)
[ 226.503334] ? ip6_input (net/ipv6/ip6_input.c:303)
[ 226.507045] __netif_receive_skb_one_core (net/core/dev.c:5486)
[ 226.512237] ? __netif_receive_skb_list_core (net/core/dev.c:5486)
[ 226.517688] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 226.521569] ? br_forward_finish (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/bridge/br_forward.c:66)
[ 226.525887] netif_receive_skb (net/core/dev.c:5693 net/core/dev.c:5752)
[ 226.530114] ? __netif_receive_skb (net/core/dev.c:5747)
[ 226.534691] ? br_multicast_set_startup_query_intvl
(net/bridge/br_multicast.c:5014)
[ 226.540574] ? br_fdb_offloaded_set (net/bridge/br_forward.c:34)
[ 226.545066] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 226.548949] br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[ 226.553091] ? br_netif_receive_skb (net/bridge/br_input.c:34)
[ 226.557579] ? br_multicast_flood (net/bridge/br_forward.c:126
net/bridge/br_forward.c:336)
[ 226.562154] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:64)
[ 226.566992] ? br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[ 226.571311] br_handle_frame_finish (net/bridge/br_input.c:216)
[ 226.576085] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 226.580753] ? br_nf_dev_queue_xmit
(net/bridge/br_netfilter_hooks.c:165
net/bridge/br_netfilter_hooks.c:778)
[ 226.585503] ? sysvec_call_function_single (arch/x86/kernel/smp.c:262)
[ 226.590607] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 226.595792] ? migrate_enable (kernel/sched/core.c:2291)
[ 226.599942] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 226.605124] br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[ 226.609267] ? asm_sysvec_apic_timer_interrupt
(./arch/x86/include/asm/idtentry.h:645)
[ 226.614720] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 226.619755] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 226.624434] ? packet_rcv (net/packet/af_packet.c:2231)
[ 226.628337] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[ 226.634336] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 226.639378] ? do_xdp_generic (net/core/dev.c:5281)
[ 226.643543] ? write_profile (kernel/stacktrace.c:86)
[ 226.647631] ? udp6_lib_lookup2 (net/ipv6/udp.c:199)
[ 226.651972] __netif_receive_skb_list_core (net/core/dev.c:5570)
[ 226.657277] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5546)
[ 226.663531] ? kasan_save_stack (mm/kasan/common.c:46)
[ 226.667698] ? kasan_set_track (mm/kasan/common.c:52)
[ 226.671768] ? __kasan_slab_alloc (mm/kasan/common.c:328)
[ 226.676095] ? ipv6_portaddr_hash.isra.0 (net/ipv4/inet_hashtables.c:300)
[ 226.681198] ? napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[ 226.685684] ? kthread (kernel/kthread.c:379)
[ 226.689221] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 226.693105] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[ 226.697507] ? ktime_get_with_offset (kernel/time/timekeeping.c:292
(discriminator 3) kernel/time/timekeeping.c:388 (discriminator 3)
kernel/time/timekeeping.c:891 (discriminator 3))
[ 226.702176] netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[ 226.707643] ? process_backlog (net/core/dev.c:5699)
[ 226.711884] ? setup_object (mm/slub.c:1832)
[ 226.715681] ? ipv6_gro_receive (net/ipv6/ip6_offload.c:281 (discriminator 7))
[ 226.720085] napi_gro_complete.constprop.0
(./include/linux/list.h:37 ./include/net/gro.h:434
./include/net/gro.h:446 net/core/gro.c:328)
[ 226.725365] dev_gro_receive (net/core/gro.c:553)
[ 226.729515] ? ixgbe_alloc_rx_buffers
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:1602)
[ 226.734361] napi_gro_receive (net/core/gro.c:657)
[ 226.738511] ixgbe_poll
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2420
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3178)
[ 226.742322] ? ixgbe_xdp_ring_update_tail_locked
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3141)
[ 226.747949] ? common_interrupt (arch/x86/kernel/irq.c:240)
[ 226.752111] __napi_poll (net/core/dev.c:6498)
[ 226.755741] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[ 226.760067] ? __napi_poll (net/core/dev.c:6625)
[ 226.763952] ? migrate_enable (kernel/sched/core.c:3045)
[ 226.768100] ? __kthread_parkme (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
kernel/kthread.c:271)
[ 226.772341] ? __napi_poll (net/core/dev.c:6625)
[ 226.776227] kthread (kernel/kthread.c:379)
[ 226.779589] ? kthread_complete_and_exit (kernel/kthread.c:336)
[ 226.784518] ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 226.788233] </TASK>
[ 226.790540] Modules linked in: chaoskey
[ 226.794603] ---[ end trace 0000000000000000 ]---
[ 226.829752] pstore: backend (erst) writing error (-28)
[ 226.835076] RIP: 0010:udpv6_queue_rcv_skb
(./include/linux/skbuff.h:2645 net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 226.840291] Code: 04 00 00 b9 01 00 00 00 48 89 de e8 bb ac f7 ff 4d
85 f6 75 97 48 83 c4 08 31 c0 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc
cc cc <0f> 0b 0f 0b e9 b1 fe ff ff e8 f3 38 41 fd e9 63 fb ff ff 4c 89
e7
All code
========
0: 04 00 add $0x0,%al
2: 00 b9 01 00 00 00 add %bh,0x1(%rcx)
8: 48 89 de mov %rbx,%rsi
b: e8 bb ac f7 ff call 0xfffffffffff7accb
10: 4d 85 f6 test %r14,%r14
13: 75 97 jne 0xffffffffffffffac
15: 48 83 c4 08 add $0x8,%rsp
19: 31 c0 xor %eax,%eax
1b: 5b pop %rbx
1c: 5d pop %rbp
1d: 41 5c pop %r12
1f: 41 5d pop %r13
21: 41 5e pop %r14
23: 41 5f pop %r15
25: c3 ret
26: cc int3
27: cc int3
28: cc int3
29: cc int3
2a:* 0f 0b ud2 <-- trapping instruction
2c: 0f 0b ud2
2e: e9 b1 fe ff ff jmp 0xfffffffffffffee4
33: e8 f3 38 41 fd call 0xfffffffffd41392b
38: e9 63 fb ff ff jmp 0xfffffffffffffba0
3d: 4c 89 e7 mov %r12,%rdi

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 0f 0b ud2
4: e9 b1 fe ff ff jmp 0xfffffffffffffeba
9: e8 f3 38 41 fd call 0xfffffffffd413901
e: e9 63 fb ff ff jmp 0xfffffffffffffb76
13: 4c 89 e7 mov %r12,%rdi
[ 226.859293] RSP: 0018:ffff888107b8ee78 EFLAGS: 00010207
[ 226.864692] RAX: 0000000000000007 RBX: ffff888109dfe8c0 RCX: 0000000000000000
[ 226.871994] RDX: ffff8881227200c0 RSI: 0000000000000004 RDI: ffff888109dfe934
[ 226.879294] RBP: 0000000000000036 R08: 0000000000000001 R09: ffff888109dfe997
[ 226.886588] R10: ffffed10213bfd32 R11: 00c000ce00f6dd86 R12: dffffc0000000000
[ 226.893899] R13: ffff8881186ae780 R14: ffff888109dff2c0 R15: 0000000000000025
[ 226.901198] FS: 0000000000000000(0000) GS:ffff8883ef300000(0000)
knlGS:0000000000000000
[ 226.909470] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 226.915379] CR2: 00007f40938086d0 CR3: 00000001197c0000 CR4: 00000000003526e0
[ 226.922684] Kernel panic - not syncing: Fatal exception in interrupt


On Tue, Jul 4, 2023 at 12:10 PM Paolo Abeni <[email protected]> wrote:
>
> On Mon, 2023-07-03 at 11:37 +0200, Ian Kumlien wrote:
> > So, got back, switched to 6.4.1 and reran with kmemleak and kasan
> >
> > I got the splat from:
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index cea28d30abb5..701c1b5cf532 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -4328,6 +4328,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> >
> > skb->prev = tail;
> >
> > + if (WARN_ON_ONCE(!skb->next))
> > + goto err_linearize;
> > +
> > if (skb_needs_linearize(skb, features) &&
> > __skb_linearize(skb))
> > goto err_linearize;
> >
> > I'm just happy i ran with dmesg -W since there was only minimal output
> > on the console:
> > [39914.833696] rcu: INFO: rcu_preempt self-detected stall on CPU
> > [39914.839598] rcu: 2-....: (20997 ticks this GP)
> > idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=4687
> > [39914.849839] rcu: (t=21017 jiffies g=18175157 q=45473 ncpus=12)
> > [39977.862108] rcu: INFO: rcu_preempt self-detected stall on CPU
> > [39977.868002] rcu: 2-....: (84001 ticks this GP)
> > idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=28434
> > [39977.878340] rcu: (t=84047 jiffies g=18175157 q=263477 ncpus=12)
> > [40040.892521] rcu: INFO: rcu_preempt self-detected stall on CPU
> > [40040.898414] rcu: 2-....: (147006 ticks this GP)
> > idle=dd64/1/0x4000000000000000 softirq=4633489/4633489 fqs=53043
> > [40040.908831] rcu: (t=147079 jiffies g=18175157 q=464422 ncpus=12)
> > [40065.080842] ixgbe 0000:06:00.1 eno2: Reset adapter
>
> Ouch, just another slightly different issue, apparently :(
>
> I'll try some wild guesses. The RCU stall could cause the OOM observed
> in the previous tests. Here the OOM did not trigger because, with
> kasan/kmemleak enabled, the kernel can process fewer packets in the
> same period of time.
>
> [...]
> > [39914.857231] skb_segment (net/core/skbuff.c:4519)
>
> I *think* this could be looping "forever", if gso_size becomes 0, which
> is in turn completely unexpected ...
>
>
> > [39914.857257] ? write_profile (kernel/stacktrace.c:83)
> > [39914.857296] ? pskb_extract (net/core/skbuff.c:4360)
> > [39914.857320] ? rt6_score_route (net/ipv6/route.c:713 (discriminator 1))
> > [39914.857346] ? llist_add_batch (lib/llist.c:33 (discriminator 14))
> > [39914.857379] __udp_gso_segment (net/ipv4/udp_offload.c:290)
> > [39914.857413] ? ip6_dst_destroy (net/ipv6/route.c:788)
> > [39914.857442] udp6_ufo_fragment (net/ipv6/udp_offload.c:47)
> > [39914.857472] ? udp6_gro_complete (net/ipv6/udp_offload.c:20)
> > [39914.857498] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:53)
> > [39914.857528] ipv6_gso_segment (net/ipv6/ip6_offload.c:119
> > net/ipv6/ip6_offload.c:74)
> > [39914.857557] ? ipv6_gso_pull_exthdrs (net/ipv6/ip6_offload.c:76)
> > [39914.857583] ? nft_update_chain_stats (net/netfilter/nf_tables_core.c:254)
> > [39914.857612] ? fib6_select_path (net/ipv6/route.c:458)
> > [39914.857643] skb_mac_gso_segment (net/core/gro.c:141)
> > [39914.857673] ? skb_eth_gso_segment (net/core/gro.c:127)
> > [39914.857702] ? ipv6_skip_exthdr (net/ipv6/exthdrs_core.c:190)
> > [39914.857726] ? kasan_save_stack (mm/kasan/common.c:47)
> > [39914.857758] __skb_gso_segment (net/core/dev.c:3401 (discriminator 2))
> > [39914.857787] udpv6_queue_rcv_skb (./include/net/udp.h:492
> > net/ipv6/udp.c:796 net/ipv6/udp.c:787)
> > [39914.857816] __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)
>
> ... but this means we are processing a multicast packet, so the skb is
> likely cloned. If one of the clone instances simultaneously enters
> skb_segment_list(), the latter would unconditionally call:
>
> skb_gso_reset(skb);
>
> clearing the gso area in the shared info and causing unexpected results
> (possibly the memory corruption observed before, and the above RCU
> stall) for the other clone instances.
>
> Assuming there are no other issues and that the above is not just a
> side effect of ENOCOFFEE here, the following should possibly solve it.
> Could you please add it to your testbed? (still with kasan+previous
> patch, kmemleak is possibly not needed).
>
> Thanks!
> ---
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 6c5915efbc17..ac1ca6c7bff9 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4263,6 +4263,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>
> skb_shinfo(skb)->frag_list = NULL;
>
> + /* later code will clear the gso area in the shared info */
> + err = skb_header_unclone(skb, GFP_ATOMIC);
> + if (err)
> + goto err_linearize;
> +
> while (list_skb) {
> nskb = list_skb;
> list_skb = list_skb->next;
>

2023-07-04 13:04:10

by Paolo Abeni

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Tue, 2023-07-04 at 13:36 +0200, Ian Kumlien wrote:
> Proper bug this time:
> cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux

To be sure, is this with the last patch I shared? this one I mean:

https://lore.kernel.org/netdev/[email protected]/

Could you please additionally enable CONFIG_DEBUG_NET in your build?

Could you please give a detailed description of your network topology
and the running traffic?

Thanks!

Paolo


2023-07-04 13:25:35

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Tue, Jul 4, 2023 at 2:54 PM Paolo Abeni <[email protected]> wrote:
>
> On Tue, 2023-07-04 at 13:36 +0200, Ian Kumlien wrote:
> > Proper bug this time:
> > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
>
> To be sure, is this with the last patch I shared? this one I mean:

The current modifications I have, on top of v6.4.1, are:
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index cea28d30abb5..8552caa197f9 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4272,6 +4272,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

skb_shinfo(skb)->frag_list = NULL;

+ /* later code will clear the gso area in the shared info */
+ err = skb_header_unclone(skb, GFP_ATOMIC);
+ if (err)
+ goto err_linearize;
+
while (list_skb) {
nskb = list_skb;
list_skb = list_skb->next;
@@ -4328,6 +4333,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

skb->prev = tail;

+ if (WARN_ON_ONCE(!skb->next))
+ goto err_linearize;
+
if (skb_needs_linearize(skb, features) &&
__skb_linearize(skb))
goto err_linearize;
---

> https://lore.kernel.org/netdev/[email protected]/
>
> Could you please additionally enable CONFIG_DEBUG_NET in your build?

Sure, will do

> Could you please give a detailed description of your network topology
> and the running traffic?

This machine has two "real interfaces" and two interfaces that run as
bridges for virtual machines:
eno1 - real internal
eno2 - bridge - internal
eno3 - real external
eno4 - bridge - external

The bridges are used by three virtual machines, two of which are
attached on both networks

Traffic seemed to be video streaming, at least at first, now I don't
really know. I do have a few smart devices so I assume there is
a bit of multicast traffic as well - but not really anything unusual as such.

> Thanks!
>
> Paolo
>

2023-07-04 13:43:32

by Paolo Abeni

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Tue, 2023-07-04 at 15:23 +0200, Ian Kumlien wrote:
> On Tue, Jul 4, 2023 at 2:54 PM Paolo Abeni <[email protected]> wrote:
> >
> > On Tue, 2023-07-04 at 13:36 +0200, Ian Kumlien wrote:
> > > Proper bug this time:
> > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> >
> > To be sure, is this with the last patch I shared? this one I mean:
>
> The current modifications I have, on top of v6.4.1, are:
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index cea28d30abb5..8552caa197f9 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4272,6 +4272,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>
> skb_shinfo(skb)->frag_list = NULL;
>
> + /* later code will clear the gso area in the shared info */
> + err = skb_header_unclone(skb, GFP_ATOMIC);
> + if (err)
> + goto err_linearize;
> +
> while (list_skb) {
> nskb = list_skb;
> list_skb = list_skb->next;
> @@ -4328,6 +4333,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>
> skb->prev = tail;
>
> + if (WARN_ON_ONCE(!skb->next))
> + goto err_linearize;
> +
> if (skb_needs_linearize(skb, features) &&
> __skb_linearize(skb))
> goto err_linearize;
> ---
>
> > https://lore.kernel.org/netdev/[email protected]/
> >
> > Could you please additionally enable CONFIG_DEBUG_NET in your build?
>
> Sure, will do
>
> > Could you please give a detailed description of your network topology
> > and the running traffic?
>
> This machine has two "real interfaces" and two interfaces that run as
> bridges for virtual machines:
> eno1 - real internal
> eno2 - bridge - internal
> eno3 - real external
> eno4 - bridge - external
>
> The bridges are used by three virtual machines, two of which are
> attached on both networks
>
> Traffic seemed to be video streaming, at least at first, now I don't
> really know. I do have a few smart devices so I assume there is
> a bit of multicast traffic as well - but not really anything unusual as such.

Is there any XDP program running on the host side? Possibly changing
the packet hdr?

Thanks!

/P


2023-07-04 14:23:17

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Tue, Jul 4, 2023 at 3:41 PM Paolo Abeni <[email protected]> wrote:
>
> On Tue, 2023-07-04 at 15:23 +0200, Ian Kumlien wrote:
> > On Tue, Jul 4, 2023 at 2:54 PM Paolo Abeni <[email protected]> wrote:
> > >
> > > On Tue, 2023-07-04 at 13:36 +0200, Ian Kumlien wrote:
> > > > Proper bug this time:
> > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > >
> > > To be sure, is this with the last patch I shared? this one I mean:
> >
> > The current modifications I have, on top of v6.4.1, are:
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index cea28d30abb5..8552caa197f9 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -4272,6 +4272,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> >
> > skb_shinfo(skb)->frag_list = NULL;
> >
> > + /* later code will clear the gso area in the shared info */
> > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > + if (err)
> > + goto err_linearize;
> > +
> > while (list_skb) {
> > nskb = list_skb;
> > list_skb = list_skb->next;
> > @@ -4328,6 +4333,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> >
> > skb->prev = tail;
> >
> > + if (WARN_ON_ONCE(!skb->next))
> > + goto err_linearize;
> > +
> > if (skb_needs_linearize(skb, features) &&
> > __skb_linearize(skb))
> > goto err_linearize;
> > ---
> >
> > > https://lore.kernel.org/netdev/[email protected]/
> > >
> > > Could you please additionally enable CONFIG_DEBUG_NET in your build?
> >
> > Sure, will do
> >
> > > Could you please give a detailed description of your network topology
> > > and the running traffic?
> >
> > This machine has two "real interfaces" and two interfaces that run as
> > bridges for virtual machines:
> > eno1 - real internal
> > eno2 - bridge - internal
> > eno3 - real external
> > eno4 - bridge - external
> >
> > The bridges are used by three virtual machines, two of which are
> > attached on both networks
> >
> > Traffic seemed to be video streaming, at least at first, now I don't
> > really know. I do have a few smart devices so I assume there is
> > a bit of multicast traffic as well - but not really anything unusual as such.
>
> Is there any XDP program running on the host side? Possibly changing
> the packet hdr?

Only systemd standard things, I haven't done anything and the normal
nftables fw doesn't do anything special

> Thanks!
>
> /P
>

2023-07-04 14:35:01

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

More stacktraces.. =)

cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
[ 411.413767] ------------[ cut here ]------------
[ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
net/ipv6/udp.c:787)
[ 411.413829] Modules linked in: chaoskey
[ 411.413857] CPU: 9 PID: 942 Comm: napi/eno2-87 Tainted: G W
6.4.1-dirty #375
[ 411.413879] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[ 411.413891] RIP: 0010:udpv6_queue_rcv_skb (./include/net/udp.h:509
net/ipv6/udp.c:800 net/ipv6/udp.c:787)
[ 411.413912] Code: 70 48 c7 c7 80 77 e6 b5 44 89 c6 e8 3e 72 ef fc 31
d2 48 89 de 48 c7 c7 c0 77 e6 b5 e8 4d e2 74 ff 0f 0b 0f 0b e9 51 fd
ff ff <0f> 0b e9 1c fe ff ff 48 b8 00 00 00 00 00 fc ff df 4c 89 f2 48
c1
All code
========
0: 70 48 jo 0x4a
2: c7 c7 80 77 e6 b5 mov $0xb5e67780,%edi
8: 44 89 c6 mov %r8d,%esi
b: e8 3e 72 ef fc call 0xfffffffffcef724e
10: 31 d2 xor %edx,%edx
12: 48 89 de mov %rbx,%rsi
15: 48 c7 c7 c0 77 e6 b5 mov $0xffffffffb5e677c0,%rdi
1c: e8 4d e2 74 ff call 0xffffffffff74e26e
21: 0f 0b ud2
23: 0f 0b ud2
25: e9 51 fd ff ff jmp 0xfffffffffffffd7b
2a:* 0f 0b ud2 <-- trapping instruction
2c: e9 1c fe ff ff jmp 0xfffffffffffffe4d
31: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
38: fc ff df
3b: 4c 89 f2 mov %r14,%rdx
3e: 48 rex.W
3f: c1 .byte 0xc1

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e9 1c fe ff ff jmp 0xfffffffffffffe23
7: 48 b8 00 00 00 00 00 movabs $0xdffffc0000000000,%rax
e: fc ff df
11: 4c 89 f2 mov %r14,%rdx
14: 48 rex.W
15: c1 .byte 0xc1
[ 411.413930] RSP: 0018:ffff888112cd6e68 EFLAGS: 00010202
[ 411.413949] RAX: 0000000000000000 RBX: ffff88811b52b7c0 RCX: 0000000000000002
[ 411.413963] RDX: 0000000000000055 RSI: 000000000000008b RDI: ffff88811b52b802
[ 411.413976] RBP: 0000000000000036 R08: 0000000000000036 R09: ffff88813ac9d0e7
[ 411.413988] R10: ffffed1027593a1c R11: 75d99cb20000000f R12: ffff88813ac9d080
[ 411.414001] R13: dffffc0000000000 R14: ffff8881a34dc0f6 R15: 0000000000000000
[ 411.414014] FS: 0000000000000000(0000) GS:ffff8883ef680000(0000)
knlGS:0000000000000000
[ 411.414029] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 411.414084] CR2: 0000562e17b4bd98 CR3: 00000001188b2000 CR4: 00000000003526e0
[ 411.414103] Call Trace:
[ 411.414114] <TASK>
[ 411.414126] ? __warn (kernel/panic.c:673)
[ 411.414149] ? udpv6_queue_rcv_skb (./include/net/udp.h:509
net/ipv6/udp.c:800 net/ipv6/udp.c:787)
[ 411.414168] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 411.414189] ? handle_bug (arch/x86/kernel/traps.c:324)
[ 411.414209] ? exc_invalid_op (arch/x86/kernel/traps.c:345 (discriminator 1))
[ 411.414229] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[ 411.414255] ? udpv6_queue_rcv_skb (./include/net/udp.h:509
net/ipv6/udp.c:800 net/ipv6/udp.c:787)
[ 411.414274] ? udpv6_queue_rcv_skb (net/ipv6/udp.c:801 net/ipv6/udp.c:787)
[ 411.414293] __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)
[ 411.414314] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[ 411.414336] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:439
(discriminator 4))
[ 411.414363] ip6_input_finish (./include/linux/rcupdate.h:805
net/ipv6/ip6_input.c:483)
[ 411.414384] ip6_input (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:491)
[ 411.414403] ? ip6_input_finish (net/ipv6/ip6_input.c:490)
[ 411.414423] ? fib6_select_path (./include/net/nexthop.h:515
net/ipv6/route.c:435)
[ 411.414443] ? ip6_protocol_deliver_rcu
(./include/linux/skbuff.h:4180 net/ipv6/ip6_input.c:480)
[ 411.414465] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[ 411.414486] ip6_mc_input (net/ipv6/ip6_input.c:591)
[ 411.414507] ? ip6_rcv_finish (net/ipv6/ip6_input.c:498)
[ 411.414529] ipv6_rcv (./include/net/dst.h:468
net/ipv6/ip6_input.c:79 ./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:309)
[ 411.414549] ? ip6_input (net/ipv6/ip6_input.c:303)
[ 411.414567] ? kthread (kernel/kthread.c:379)
[ 411.414587] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 411.414605] ? ipv6_gro_complete (net/ipv6/ip6_offload.c:357)
[ 411.414623] ? ipv6_list_rcv (./include/net/l3mdev.h:169
./include/net/l3mdev.h:190 net/ipv6/ip6_input.c:74)
[ 411.414644] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:55)
[ 411.414666] ? ip6_input (net/ipv6/ip6_input.c:303)
[ 411.414685] __netif_receive_skb_one_core (net/core/dev.c:5486)
[ 411.414709] ? __netif_receive_skb_list_core (net/core/dev.c:5486)
[ 411.414730] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 411.414754] ? br_forward_finish (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/bridge/br_forward.c:66)
[ 411.414774] netif_receive_skb (net/core/dev.c:5693 net/core/dev.c:5752)
[ 411.414795] ? __netif_receive_skb (net/core/dev.c:5747)
[ 411.414815] ? br_multicast_set_startup_query_intvl
(net/bridge/br_multicast.c:5014)
[ 411.414835] ? br_fdb_offloaded_set (net/bridge/br_forward.c:34)
[ 411.414857] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 411.414879] br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[ 411.414900] ? br_netif_receive_skb (net/bridge/br_input.c:34)
[ 411.414921] ? br_multicast_flood (net/bridge/br_forward.c:126
net/bridge/br_forward.c:336)
[ 411.414943] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:64)
[ 411.414964] ? br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[ 411.414986] br_handle_frame_finish (net/bridge/br_input.c:216)
[ 411.415009] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 411.415031] ? br_nf_post_routing
(net/bridge/br_netfilter_hooks.c:116
net/bridge/br_netfilter_hooks.c:837)
[ 411.415070] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 411.415093] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 411.415115] ? br_nf_post_routing
(net/bridge/br_netfilter_hooks.c:116
net/bridge/br_netfilter_hooks.c:837)
[ 411.415135] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 411.415155] br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[ 411.415178] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 411.415201] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 411.415222] ? packet_rcv (net/packet/af_packet.c:2231)
[ 411.415242] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[ 411.415265] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 411.415288] ? do_xdp_generic (net/core/dev.c:5281)
[ 411.415309] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[ 411.415332] ? udp6_lib_lookup2 (net/ipv6/udp.c:199)
[ 411.415352] __netif_receive_skb_list_core (net/core/dev.c:5570)
[ 411.415376] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5546)
[ 411.415398] ? __netif_receive_skb_list_core (net/core/dev.c:5570)
[ 411.415419] ? ipv6_portaddr_hash.isra.0 (net/ipv4/inet_hashtables.c:300)
[ 411.415438] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[ 411.415459] ? ktime_get_with_offset (kernel/time/timekeeping.c:292
(discriminator 3) kernel/time/timekeeping.c:388 (discriminator 3)
kernel/time/timekeeping.c:891 (discriminator 3))
[ 411.415481] netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[ 411.415504] ? process_backlog (net/core/dev.c:5699)
[ 411.415524] ? netif_receive_skb_list_internal (net/core/dev.c:5699)
[ 411.415547] ? ipv6_gro_receive (net/ipv6/ip6_offload.c:281 (discriminator 7))
[ 411.415565] ? dev_close_many (net/core/dev.c:1516)
[ 411.415585] napi_gro_complete.constprop.0
(./include/linux/list.h:37 ./include/net/gro.h:434
./include/net/gro.h:446 net/core/gro.c:328)
[ 411.415608] dev_gro_receive (net/core/gro.c:553)
[ 411.415631] napi_gro_receive (net/core/gro.c:657)
[ 411.415651] ixgbe_poll
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2420
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3178)
[ 411.415678] ? ixgbe_xdp_ring_update_tail_locked
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3141)
[ 411.415702] ? __napi_poll (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
net/core/dev.c:6497)
[ 411.415723] __napi_poll (net/core/dev.c:6498)
[ 411.415743] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[ 411.415766] ? __napi_poll (net/core/dev.c:6625)
[ 411.415785] ? migrate_enable (kernel/sched/core.c:3045)
[ 411.415808] ? __kthread_parkme (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
kernel/kthread.c:271)
[ 411.415827] ? __napi_poll (net/core/dev.c:6625)
[ 411.415847] kthread (kernel/kthread.c:379)
[ 411.415866] ? kthread_complete_and_exit (kernel/kthread.c:336)
[ 411.415887] ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 411.415908] </TASK>
[ 411.415919] ---[ end trace 0000000000000000 ]---
[ 411.415932] ==================================================================
[ 411.423277] BUG: KASAN: slab-use-after-free in udpv6_queue_rcv_skb
(./include/net/udp.h:524 net/ipv6/udp.c:800 net/ipv6/udp.c:787)
[ 411.430624] Write of size 2 at addr ffff88811b52b800 by task napi/eno2-87/942

[ 411.439464] CPU: 9 PID: 942 Comm: napi/eno2-87 Tainted: G W
6.4.1-dirty #375
[ 411.448024] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[ 411.456225] Call Trace:
[ 411.458772] <TASK>
[ 411.460973] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1))
[ 411.464754] print_report (mm/kasan/report.c:352 mm/kasan/report.c:462)
[ 411.468455] ? udpv6_queue_rcv_skb (./include/net/udp.h:524
net/ipv6/udp.c:800 net/ipv6/udp.c:787)
[ 411.473012] kasan_report (mm/kasan/report.c:574)
[ 411.476618] ? udpv6_queue_rcv_skb (./include/net/udp.h:524
net/ipv6/udp.c:800 net/ipv6/udp.c:787)
[ 411.481175] udpv6_queue_rcv_skb (./include/net/udp.h:524
net/ipv6/udp.c:800 net/ipv6/udp.c:787)
[ 411.485561] __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)
[ 411.489688] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[ 411.494254] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:439
(discriminator 4))
[ 411.499189] ip6_input_finish (./include/linux/rcupdate.h:805
net/ipv6/ip6_input.c:483)
[ 411.503240] ip6_input (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:491)
[ 411.506688] ? ip6_input_finish (net/ipv6/ip6_input.c:490)
[ 411.511006] ? fib6_select_path (./include/net/nexthop.h:515
net/ipv6/route.c:435)
[ 411.515320] ? ip6_protocol_deliver_rcu
(./include/linux/skbuff.h:4180 net/ipv6/ip6_input.c:480)
[ 411.520504] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[ 411.525079] ip6_mc_input (net/ipv6/ip6_input.c:591)
[ 411.528877] ? ip6_rcv_finish (net/ipv6/ip6_input.c:498)
[ 411.533020] ipv6_rcv (./include/net/dst.h:468
net/ipv6/ip6_input.c:79 ./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:309)
[ 411.536468] ? ip6_input (net/ipv6/ip6_input.c:303)
[ 411.540174] ? kthread (kernel/kthread.c:379)
[ 411.543703] ? ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 411.547575] ? ipv6_gro_complete (net/ipv6/ip6_offload.c:357)
[ 411.551980] ? ipv6_list_rcv (./include/net/l3mdev.h:169
./include/net/l3mdev.h:190 net/ipv6/ip6_input.c:74)
[ 411.556036] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:55)
[ 411.560871] ? ip6_input (net/ipv6/ip6_input.c:303)
[ 411.564580] __netif_receive_skb_one_core (net/core/dev.c:5486)
[ 411.569765] ? __netif_receive_skb_list_core (net/core/dev.c:5486)
[ 411.575213] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 411.579098] ? br_forward_finish (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/bridge/br_forward.c:66)
[ 411.583414] netif_receive_skb (net/core/dev.c:5693 net/core/dev.c:5752)
[ 411.587641] ? __netif_receive_skb (net/core/dev.c:5747)
[ 411.592218] ? br_multicast_set_startup_query_intvl
(net/bridge/br_multicast.c:5014)
[ 411.598102] ? br_fdb_offloaded_set (net/bridge/br_forward.c:34)
[ 411.602618] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 411.606502] br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[ 411.610645] ? br_netif_receive_skb (net/bridge/br_input.c:34)
[ 411.615133] ? br_multicast_flood (net/bridge/br_forward.c:126
net/bridge/br_forward.c:336)
[ 411.619708] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:64)
[ 411.624545] ? br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[ 411.628861] br_handle_frame_finish (net/bridge/br_input.c:216)
[ 411.633610] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 411.638273] ? br_nf_post_routing
(net/bridge/br_netfilter_hooks.c:116
net/bridge/br_netfilter_hooks.c:837)
[ 411.642849] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 411.648030] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 411.652693] ? br_nf_post_routing
(net/bridge/br_netfilter_hooks.c:116
net/bridge/br_netfilter_hooks.c:837)
[ 411.657270] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 411.662451] br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[ 411.666596] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 411.671612] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 411.676274] ? packet_rcv (net/packet/af_packet.c:2231)
[ 411.680148] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[ 411.686119] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 411.691129] ? do_xdp_generic (net/core/dev.c:5281)
[ 411.695273] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[ 411.701417] ? udp6_lib_lookup2 (net/ipv6/udp.c:199)
[ 411.705732] __netif_receive_skb_list_core (net/core/dev.c:5570)
[ 411.711004] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5546)
[ 411.717232] ? __netif_receive_skb_list_core (net/core/dev.c:5570)
[ 411.722674] ? ipv6_portaddr_hash.isra.0 (net/ipv4/inet_hashtables.c:300)
[ 411.727770] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[ 411.732172] ? ktime_get_with_offset (kernel/time/timekeeping.c:292
(discriminator 3) kernel/time/timekeeping.c:388 (discriminator 3)
kernel/time/timekeeping.c:891 (discriminator 3))
[ 411.736834] netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[ 411.742281] ? process_backlog (net/core/dev.c:5699)
[ 411.746506] ? netif_receive_skb_list_internal (net/core/dev.c:5699)
[ 411.752122] ? ipv6_gro_receive (net/ipv6/ip6_offload.c:281 (discriminator 7))
[ 411.756526] ? dev_close_many (net/core/dev.c:1516)
[ 411.760582] napi_gro_complete.constprop.0
(./include/linux/list.h:37 ./include/net/gro.h:434
./include/net/gro.h:446 net/core/gro.c:328)
[ 411.765853] dev_gro_receive (net/core/gro.c:553)
[ 411.770084] napi_gro_receive (net/core/gro.c:657)
[ 411.774224] ixgbe_poll
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2420
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3178)
[ 411.778026] ? ixgbe_xdp_ring_update_tail_locked
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3141)
[ 411.783644] ? __napi_poll (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
net/core/dev.c:6497)
[ 411.787440] __napi_poll (net/core/dev.c:6498)
[ 411.791064] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[ 411.795381] ? __napi_poll (net/core/dev.c:6625)
[ 411.799258] ? migrate_enable (kernel/sched/core.c:3045)
[ 411.803404] ? __kthread_parkme (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
kernel/kthread.c:271)
[ 411.807631] ? __napi_poll (net/core/dev.c:6625)
[ 411.811515] kthread (kernel/kthread.c:379)
[ 411.814876] ? kthread_complete_and_exit (kernel/kthread.c:336)
[ 411.819800] ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 411.823505] </TASK>

[ 411.827426] Allocated by task 933:
[ 411.830945] kasan_save_stack (mm/kasan/common.c:46)
[ 411.830969] kasan_set_track (mm/kasan/common.c:52)
[ 411.830987] __kasan_slab_alloc (mm/kasan/common.c:328)
[ 411.831006] kmem_cache_alloc_bulk (mm/slab.h:711 mm/slub.c:4033)
[ 411.831025] napi_skb_cache_get (net/core/skbuff.c:338)
[ 411.831044] __napi_build_skb (net/core/skbuff.c:517)
[ 411.831062] napi_build_skb (net/core/skbuff.c:539)
[ 411.831080] ixgbe_poll
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2165
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:2361
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3178)
[ 411.831097] __napi_poll (net/core/dev.c:6498)
[ 411.831114] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[ 411.831131] kthread (kernel/kthread.c:379)
[ 411.831148] ret_from_fork (arch/x86/entry/entry_64.S:314)

[ 411.832772] Freed by task 996:
[ 411.835945] kasan_save_stack (mm/kasan/common.c:46)
[ 411.835965] kasan_set_track (mm/kasan/common.c:52)
[ 411.835984] kasan_save_free_info (mm/kasan/generic.c:523)
[ 411.836000] __kasan_slab_free (mm/kasan/common.c:238
mm/kasan/common.c:200 mm/kasan/common.c:244)
[ 411.836019] kmem_cache_free (mm/slub.c:1807 mm/slub.c:3786 mm/slub.c:3808)
[ 411.836036] tun_do_read (drivers/net/tun.c:2242)
[ 411.836054] tun_recvmsg (drivers/net/tun.c:2624)
[ 411.836070] handle_rx (drivers/vhost/net.c:1213)
[ 411.836089] vhost_worker (./include/linux/sched.h:2086
(discriminator 3) drivers/vhost/vhost.c:354 (discriminator 3))
[ 411.836104] vhost_task_fn (kernel/vhost_task.c:56)
[ 411.836122] ret_from_fork (arch/x86/entry/entry_64.S:314)

[ 411.837746] The buggy address belongs to the object at ffff88811b52b7c0
which belongs to the cache skbuff_head_cache of size 224
[ 411.850937] The buggy address is located 64 bytes inside of
freed 224-byte region [ffff88811b52b7c0, ffff88811b52b8a0)

[ 411.864866] The buggy address belongs to the physical page:
[ 411.870561] page:00000000e6f9dafd refcount:1 mapcount:0
mapping:0000000000000000 index:0x0 pfn:0x11b52a
[ 411.870582] head:00000000e6f9dafd order:1 entire_mapcount:0
nr_pages_mapped:0 pincount:0
[ 411.870598] flags: 0x8000000000010200(slab|head|zone=2)
[ 411.870616] page_type: 0xffffffff()
[ 411.870634] raw: 8000000000010200 ffff88810033adc0 dead000000000122
0000000000000000
[ 411.870650] raw: 0000000000000000 0000000000190019 00000001ffffffff
0000000000000000
[ 411.870661] page dumped because: kasan: bad access detected

[ 411.872284] Memory state around the buggy address:
[ 411.877201] ffff88811b52b700: fb fb fb fb fb fb fb fb fb fb fb fb
fc fc fc fc
[ 411.884566] ffff88811b52b780: fc fc fc fc fc fc fc fc fa fb fb fb
fb fb fb fb
[ 411.891933] >ffff88811b52b800: fb fb fb fb fb fb fb fb fb fb fb fb
fb fb fb fb
[ 411.899295] ^
[ 411.902644] ffff88811b52b880: fb fb fb fb fc fc fc fc fc fc fc fc
fc fc fc fc
[ 411.910011] ffff88811b52b900: fa fb fb fb fb fb fb fb fb fb fb fb
fb fb fb fb
[ 411.917373] ==================================================================
[ 411.924832] Disabling lock debugging due to kernel taint
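
As a reading aid for the KASAN report above, the address arithmetic can be checked with a small sketch. The three addresses are copied from the report. The shadow-memory mapping (one shadow byte per 8 bytes, offset 0xdffffc0000000000, the constant that shows up in R13 in the register dumps) is an assumption about generic KASAN on x86-64, and the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the KASAN report above. */
#define OBJ_START 0xffff88811b52b7c0ULL /* start of the freed skbuff_head_cache object */
#define OBJ_END   0xffff88811b52b8a0ULL /* end of the region (exclusive) */
#define BAD_ADDR  0xffff88811b52b800ULL /* target of the 2-byte write */

/* Assumption: generic KASAN on x86-64, 8 bytes of memory per shadow byte. */
#define SHADOW_SCALE_SHIFT 3
#define SHADOW_OFFSET      0xdffffc0000000000ULL

/* Byte offset of an address inside the object that contains it. */
static uint64_t offset_in_object(uint64_t addr, uint64_t start, uint64_t end)
{
	assert(addr >= start && addr < end);
	return addr - start;
}

/* Address of the shadow byte that describes a given kernel address. */
static uint64_t kasan_shadow_addr(uint64_t addr)
{
	return (addr >> SHADOW_SCALE_SHIFT) + SHADOW_OFFSET;
}
```

Calling offset_in_object(BAD_ADDR, OBJ_START, OBJ_END) gives the "64 bytes inside" figure from the report, and OBJ_END - OBJ_START gives the 224-byte object size. Under the stated assumption, kasan_shadow_addr(BAD_ADDR) lands in the same ffffed10... range as the R10 value in the register dump.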

On Tue, Jul 4, 2023 at 4:06 PM Ian Kumlien <[email protected]> wrote:
>
> On Tue, Jul 4, 2023 at 3:41 PM Paolo Abeni <[email protected]> wrote:
> >
> > On Tue, 2023-07-04 at 15:23 +0200, Ian Kumlien wrote:
> > > On Tue, Jul 4, 2023 at 2:54 PM Paolo Abeni <[email protected]> wrote:
> > > >
> > > > On Tue, 2023-07-04 at 13:36 +0200, Ian Kumlien wrote:
> > > > > > Proper bug this time:
> > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > >
> > > > To be sure, is this with the last patch I shared? this one I mean:
> > >
> > > > The current modifications I have, on top of v6.4.1, are:
> > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > index cea28d30abb5..8552caa197f9 100644
> > > --- a/net/core/skbuff.c
> > > +++ b/net/core/skbuff.c
> > > @@ -4272,6 +4272,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > >
> > > skb_shinfo(skb)->frag_list = NULL;
> > >
> > > + /* later code will clear the gso area in the shared info */
> > > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > > + if (err)
> > > + goto err_linearize;
> > > +
> > > while (list_skb) {
> > > nskb = list_skb;
> > > list_skb = list_skb->next;
> > > @@ -4328,6 +4333,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > >
> > > skb->prev = tail;
> > >
> > > + if (WARN_ON_ONCE(!skb->next))
> > > + goto err_linearize;
> > > +
> > > if (skb_needs_linearize(skb, features) &&
> > > __skb_linearize(skb))
> > > goto err_linearize;
> > > ---
> > >
> > > > https://lore.kernel.org/netdev/[email protected]/
> > > >
> > > > Could you please additionally enable CONFIG_DEBUG_NET in your build?
> > >
> > > Sure, will do
> > >
> > > > Could you please give a detailed description of your network topology
> > > > and the running traffic?
> > >
> > > > This machine has two "real interfaces" and two interfaces that run as
> > > bridges for virtual machines
> > > eno1 - real internal
> > > eno2 - bridge - internal
> > > eno3 - real external
> > > eno4 - bridge - external
> > >
> > > The bridges are used by three virtual machines, two of which are
> > > attached on both networks
> > >
> > > Traffic seemed to be video streaming, at least at first, now I don't
> > > really know. I do have a few smart devices so I assume there is
> > > a bit of multicast traffic as well - but not really anything unusual as such.
> >
> > > Is there any XDP program running on the host side? Possibly changing
> > the packet hdr?
>
> Only systemd standard things, I haven't done anything and the normal
> nftables fw doesn't do anything special
>
> > Thanks!
> >
> > /P
> >

2023-07-04 15:09:00

by Ian Kumlien

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

And I kept it running to see if we would end up with the same bug as
last time. It seems we did, but now with more info =)

cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
[ 2061.424044] __skb_pull(len=54)
[ 2061.427257] skb len=91 headroom=192 headlen=153 tailroom=0
mac=(192,14) net=(206,40) trans=246
shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0))
csum(0x0 ip_summed=1 complete_sw=0 valid=1 level=0)
hash(0x75d99cb2 sw=0 l4=0) proto=0x86dd pkttype=2 iif=15
[ 2061.455894] dev name=local-lan feat=0x00002003bfdd78e9
[ 2061.461179] skb linear: 00000000: 33 33 00 00 00 fb c8 7f 54 b1
7d f8 86 dd 60 05
[ 2061.468982] skb linear: 00000010: 1d 4c 00 25 11 ff fe 80 00 00
00 00 00 00 ca 7f
[ 2061.476788] skb linear: 00000020: 54 ff fe b1 7d f8 ff 02 00 00
00 00 00 00 00 00
[ 2061.484585] skb linear: 00000030: 00 00 00 00 00 fb 14 e9 14 e9
00 63 90 15 00 00
[ 2061.492382] skb linear: 00000040: 00 00 00 01 00 00 00 00 00 00
32 53 48 49 45 4c
[ 2061.500180] skb linear: 00000050: 44 2d 41 6e 64 72 6f 69 64 2d 54
[ 2061.506809] ------------[ cut here ]------------
[ 2061.506822] kernel BUG at include/linux/skbuff.h:2645!
[ 2061.512109] invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
[ 2061.517979] CPU: 4 PID: 942 Comm: napi/eno2-87 Tainted: G B W
6.4.1-dirty #375
[ 2061.526572] Hardware name: Supermicro Super Server/A2SDi-12C-HLN4F,
BIOS 1.7a 10/13/2022
[ 2061.534807] RIP: 0010:udpv6_queue_rcv_skb
(./include/linux/skbuff.h:2645 net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 2061.539997] Code: 40 fd e9 f6 fb ff ff 89 73 70 48 c7 c7 80 77 e6
b5 44 89 c6 e8 3e 72 ef fc 31 d2 48 89 de 48 c7 c7 c0 77 e6 b5 e8 4d
e2 74 ff <0f> 0b 0f 0b e9 51 fd ff ff 0f 0b e9 1c fe ff ff 48 b8 00 00
00 00
All code
========
0: 40 fd rex std
2: e9 f6 fb ff ff jmp 0xfffffffffffffbfd
7: 89 73 70 mov %esi,0x70(%rbx)
a: 48 c7 c7 80 77 e6 b5 mov $0xffffffffb5e67780,%rdi
11: 44 89 c6 mov %r8d,%esi
14: e8 3e 72 ef fc call 0xfffffffffcef7257
19: 31 d2 xor %edx,%edx
1b: 48 89 de mov %rbx,%rsi
1e: 48 c7 c7 c0 77 e6 b5 mov $0xffffffffb5e677c0,%rdi
25: e8 4d e2 74 ff call 0xffffffffff74e277
2a:* 0f 0b ud2 <-- trapping instruction
2c: 0f 0b ud2
2e: e9 51 fd ff ff jmp 0xfffffffffffffd84
33: 0f 0b ud2
35: e9 1c fe ff ff jmp 0xfffffffffffffe56
3a: 48 rex.W
3b: b8 00 00 00 00 mov $0x0,%eax

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 0f 0b ud2
4: e9 51 fd ff ff jmp 0xfffffffffffffd5a
9: 0f 0b ud2
b: e9 1c fe ff ff jmp 0xfffffffffffffe2c
10: 48 rex.W
11: b8 00 00 00 00 mov $0x0,%eax
[ 2061.558949] RSP: 0018:ffff888112cd6ef0 EFLAGS: 00010286
[ 2061.564307] RAX: 0000000000000000 RBX: ffff88812623d040 RCX: ffffffffb318313a
[ 2061.571567] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff888112cd6ab8
[ 2061.578832] RBP: 0000000000000036 R08: 0000000000000001 R09: ffff888112cd6abf
[ 2061.586094] R10: ffffed102259ad57 R11: 000000000000005b R12: ffff88813ac9d080
[ 2061.593356] R13: dffffc0000000000 R14: ffff8881156cc0c0 R15: ffff88810c287540
[ 2061.600619] FS: 0000000000000000(0000) GS:ffff8883ef400000(0000)
knlGS:0000000000000000
[ 2061.608861] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2061.614737] CR2: 00007ff4c106066c CR3: 000000011c2f8000 CR4: 00000000003526e0
[ 2061.621999] Call Trace:
[ 2061.624568] <TASK>
[ 2061.626783] ? die (arch/x86/kernel/dumpstack.c:421
arch/x86/kernel/dumpstack.c:434 arch/x86/kernel/dumpstack.c:447)
[ 2061.629799] ? do_trap (arch/x86/kernel/traps.c:124
arch/x86/kernel/traps.c:165)
[ 2061.633326] ? udpv6_queue_rcv_skb (./include/linux/skbuff.h:2645
net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 2061.637905] ? udpv6_queue_rcv_skb (./include/linux/skbuff.h:2645
net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 2061.642479] ? handle_invalid_op (arch/x86/kernel/traps.c:88
arch/x86/kernel/traps.c:186 arch/x86/kernel/traps.c:297)
[ 2061.646708] ? udpv6_queue_rcv_skb (./include/linux/skbuff.h:2645
net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 2061.651283] ? exc_invalid_op (arch/x86/kernel/traps.c:352)
[ 2061.655253] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[ 2061.659571] ? llist_add_batch (lib/llist.c:33 (discriminator 14))
[ 2061.663713] ? udpv6_queue_rcv_skb (./include/linux/skbuff.h:2645
net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 2061.668290] __udp6_lib_rcv (net/ipv6/udp.c:906 net/ipv6/udp.c:1013)
[ 2061.672433] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[ 2061.677006] ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:439
(discriminator 4))
[ 2061.681932] ip6_input_finish (./include/linux/rcupdate.h:805
net/ipv6/ip6_input.c:483)
[ 2061.685986] ip6_input (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:491)
[ 2061.689434] ? ip6_input_finish (net/ipv6/ip6_input.c:490)
[ 2061.693749] ? fib6_select_path (./include/net/nexthop.h:515
net/ipv6/route.c:435)
[ 2061.698067] ? ip6_protocol_deliver_rcu
(./include/linux/skbuff.h:4180 net/ipv6/ip6_input.c:480)
[ 2061.703256] ? ipv6_chk_mcast_addr (net/ipv6/mcast.c:1048)
[ 2061.707834] ip6_mc_input (net/ipv6/ip6_input.c:591)
[ 2061.711629] ? ip6_rcv_finish (net/ipv6/ip6_input.c:498)
[ 2061.715775] ipv6_rcv (./include/net/dst.h:468
net/ipv6/ip6_input.c:79 ./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/ipv6/ip6_input.c:309)
[ 2061.719222] ? ip6_input (net/ipv6/ip6_input.c:303)
[ 2061.722928] ? ipv6_mc_validate_checksum (net/ipv6/mcast_snoop.c:173)
[ 2061.728025] ? ipv6_list_rcv (./include/net/l3mdev.h:169
./include/net/l3mdev.h:190 net/ipv6/ip6_input.c:74)
[ 2061.732081] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:55)
[ 2061.736918] ? ip6_input (net/ipv6/ip6_input.c:303)
[ 2061.740627] __netif_receive_skb_one_core (net/core/dev.c:5486)
[ 2061.745812] ? __netif_receive_skb_list_core (net/core/dev.c:5486)
[ 2061.751260] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 2061.755144] ? br_forward_finish (./include/linux/netfilter.h:303
./include/linux/netfilter.h:297 net/bridge/br_forward.c:66)
[ 2061.759459] netif_receive_skb (net/core/dev.c:5693 net/core/dev.c:5752)
[ 2061.763687] ? __netif_receive_skb (net/core/dev.c:5747)
[ 2061.768265] ? br_multicast_set_startup_query_intvl
(net/bridge/br_multicast.c:5014)
[ 2061.774148] ? br_fdb_offloaded_set (net/bridge/br_forward.c:34)
[ 2061.778637] ? nf_hook_slow (./include/linux/netfilter.h:143
net/netfilter/core.c:626)
[ 2061.782521] br_pass_frame_up (net/bridge/br_input.c:30
./include/linux/netfilter.h:303 ./include/linux/netfilter.h:297
net/bridge/br_input.c:68)
[ 2061.786663] ? br_netif_receive_skb (net/bridge/br_input.c:34)
[ 2061.791153] ? br_multicast_flood (net/bridge/br_forward.c:126
net/bridge/br_forward.c:336)
[ 2061.795728] ? br_dev_queue_push_xmit (net/bridge/br_forward.c:64)
[ 2061.800566] br_handle_frame_finish (net/bridge/br_input.c:216)
[ 2061.805314] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 2061.809977] ? br_nf_post_routing
(net/bridge/br_netfilter_hooks.c:116
net/bridge/br_netfilter_hooks.c:837)
[ 2061.814553] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 2061.819733] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 2061.824916] ? br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[ 2061.829232] ? sysvec_apic_timer_interrupt (arch/x86/kernel/apic/apic.c:1106)
[ 2061.834330] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 2061.839338] ? br_nf_pre_routing_finish (net/bridge/br_netfilter_hooks.c:481)
[ 2061.844522] br_handle_frame (net/bridge/br_input.c:298
net/bridge/br_input.c:416)
[ 2061.848742] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 2061.853757] ? reuseport_select_sock (net/core/sock_reuseport.c:573)
[ 2061.858508] ? br_handle_local_finish (net/bridge/br_input.c:75)
[ 2061.863170] ? packet_rcv (net/packet/af_packet.c:2231)
[ 2061.867045] __netif_receive_skb_core.constprop.0 (net/core/dev.c:5387)
[ 2061.873017] ? br_handle_frame_finish (net/bridge/br_input.c:321)
[ 2061.878026] ? udp6_lib_lookup2 (net/ipv6/udp.c:199)
[ 2061.882339] ? do_xdp_generic (net/core/dev.c:5281)
[ 2061.886484] ? __udp6_lib_lookup (net/ipv6/udp.c:276)
[ 2061.890889] __netif_receive_skb_list_core (net/core/dev.c:5570)
[ 2061.896165] ? ipv6_portaddr_hash.isra.0 (net/ipv4/inet_hashtables.c:300)
[ 2061.901260] ? udp_push_pending_frames (net/ipv4/udp.c:495)
[ 2061.906010] ? __napi_poll (net/core/dev.c:6498)
[ 2061.909805] ? napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[ 2061.914294] ? kthread (kernel/kthread.c:379)
[ 2061.917830] ? __netif_receive_skb_core.constprop.0 (net/core/dev.c:5546)
[ 2061.924065] ? recalibrate_cpu_khz (./arch/x86/include/asm/msr.h:215
arch/x86/kernel/tsc.c:1110)
[ 2061.928464] ? ktime_get_with_offset (kernel/time/timekeeping.c:292
(discriminator 3) kernel/time/timekeeping.c:388 (discriminator 3)
kernel/time/timekeeping.c:891 (discriminator 3))
[ 2061.933128] netif_receive_skb_list_internal (net/core/dev.c:5638
net/core/dev.c:5727)
[ 2061.938580] ? ipv6_gro_receive (net/ipv6/ip6_offload.c:281 (discriminator 7))
[ 2061.942980] ? process_backlog (net/core/dev.c:5699)
[ 2061.947212] ? napi_gro_complete.constprop.0 (net/core/gro.c:321)
[ 2061.952662] ? napi_gro_flush (./arch/x86/include/asm/bitops.h:94
./include/asm-generic/bitops/instrumented-non-atomic.h:45
net/core/gro.c:346 net/core/gro.c:361)
[ 2061.956804] ? dev_gro_receive (./arch/x86/include/asm/bitops.h:228
(discriminator 8) ./arch/x86/include/asm/bitops.h:240 (discriminator
8) ./include/asm-generic/bitops/instrumented-non-atomic.h:142
(discriminator 8) net/core/gro.c:580 (discriminator 8))
[ 2061.961121] napi_complete_done (./include/linux/list.h:37
./include/net/gro.h:434 ./include/net/gro.h:429 net/core/dev.c:6067)
[ 2061.965437] ? napi_busy_loop (net/core/dev.c:6034)
[ 2061.969579] ixgbe_poll (drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3191)
[ 2061.973379] ? ixgbe_xdp_ring_update_tail_locked
(drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:3141)
[ 2061.978998] ? common_interrupt (arch/x86/kernel/irq.c:240)
[ 2061.983140] ? asm_sysvec_call_function_single
(./arch/x86/include/asm/idtentry.h:653)
[ 2061.988596] __napi_poll (net/core/dev.c:6498)
[ 2061.992217] napi_threaded_poll (./include/linux/netpoll.h:89
net/core/dev.c:6640)
[ 2061.996534] ? __napi_poll (net/core/dev.c:6625)
[ 2062.000411] ? migrate_enable (kernel/sched/core.c:3045)
[ 2062.004557] ? __kthread_parkme (./arch/x86/include/asm/bitops.h:207
./arch/x86/include/asm/bitops.h:239
./include/asm-generic/bitops/instrumented-non-atomic.h:142
kernel/kthread.c:271)
[ 2062.008784] ? __napi_poll (net/core/dev.c:6625)
[ 2062.012658] kthread (kernel/kthread.c:379)
[ 2062.016014] ? kthread_complete_and_exit (kernel/kthread.c:336)
[ 2062.020935] ret_from_fork (arch/x86/entry/entry_64.S:314)
[ 2062.024640] </TASK>
[ 2062.026950] Modules linked in: chaoskey
[ 2062.031016] ---[ end trace 0000000000000000 ]---
[ 2062.072482] pstore: backend (erst) writing error (-28)
[ 2062.077802] RIP: 0010:udpv6_queue_rcv_skb
(./include/linux/skbuff.h:2645 net/ipv6/udp.c:798 net/ipv6/udp.c:787)
[ 2062.083036] Code: 40 fd e9 f6 fb ff ff 89 73 70 48 c7 c7 80 77 e6
b5 44 89 c6 e8 3e 72 ef fc 31 d2 48 89 de 48 c7 c7 c0 77 e6 b5 e8 4d
e2 74 ff <0f> 0b 0f 0b e9 51 fd ff ff 0f 0b e9 1c fe ff ff 48 b8 00 00
00 00
All code
========
0: 40 fd rex std
2: e9 f6 fb ff ff jmp 0xfffffffffffffbfd
7: 89 73 70 mov %esi,0x70(%rbx)
a: 48 c7 c7 80 77 e6 b5 mov $0xffffffffb5e67780,%rdi
11: 44 89 c6 mov %r8d,%esi
14: e8 3e 72 ef fc call 0xfffffffffcef7257
19: 31 d2 xor %edx,%edx
1b: 48 89 de mov %rbx,%rsi
1e: 48 c7 c7 c0 77 e6 b5 mov $0xffffffffb5e677c0,%rdi
25: e8 4d e2 74 ff call 0xffffffffff74e277
2a:* 0f 0b ud2 <-- trapping instruction
2c: 0f 0b ud2
2e: e9 51 fd ff ff jmp 0xfffffffffffffd84
33: 0f 0b ud2
35: e9 1c fe ff ff jmp 0xfffffffffffffe56
3a: 48 rex.W
3b: b8 00 00 00 00 mov $0x0,%eax

Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: 0f 0b ud2
4: e9 51 fd ff ff jmp 0xfffffffffffffd5a
9: 0f 0b ud2
b: e9 1c fe ff ff jmp 0xfffffffffffffe2c
10: 48 rex.W
11: b8 00 00 00 00 mov $0x0,%eax
[ 2062.102033] RSP: 0018:ffff888112cd6ef0 EFLAGS: 00010286
[ 2062.107428] RAX: 0000000000000000 RBX: ffff88812623d040 RCX: ffffffffb318313a
[ 2062.114722] RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffff888112cd6ab8
[ 2062.122033] RBP: 0000000000000036 R08: 0000000000000001 R09: ffff888112cd6abf
[ 2062.129329] R10: ffffed102259ad57 R11: 000000000000005b R12: ffff88813ac9d080
[ 2062.136629] R13: dffffc0000000000 R14: ffff8881156cc0c0 R15: ffff88810c287540
[ 2062.143937] FS: 0000000000000000(0000) GS:ffff8883ef400000(0000)
knlGS:0000000000000000
[ 2062.152211] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2062.158120] CR2: 00007ff4c106066c CR3: 000000011c2f8000 CR4: 00000000003526e0
[ 2062.165415] Kernel panic - not syncing: Fatal exception in interrupt
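
The numbers in the skb dump above can be cross-checked against the header bytes with a short sketch. The byte array is copied from the "skb linear" lines. Two things are assumptions rather than facts from this thread: that the headlen printed by the dump equals len minus data_len, and that the check behind the BUG at skbuff.h:2645 fires when len drops below data_len after the pull. The helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN  14 /* Ethernet header, matches mac=(192,14) */
#define IPV6_HLEN 40 /* fixed IPv6 header, matches net=(206,40) */

/* First 62 bytes of the dump: Ethernet, IPv6 and UDP headers. */
static const uint8_t pkt[] = {
	0x33, 0x33, 0x00, 0x00, 0x00, 0xfb, 0xc8, 0x7f,
	0x54, 0xb1, 0x7d, 0xf8, 0x86, 0xdd, 0x60, 0x05,
	0x1d, 0x4c, 0x00, 0x25, 0x11, 0xff, 0xfe, 0x80,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xca, 0x7f,
	0x54, 0xff, 0xfe, 0xb1, 0x7d, 0xf8, 0xff, 0x02,
	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
	0x00, 0x00, 0x00, 0x00, 0x00, 0xfb, 0x14, 0xe9,
	0x14, 0xe9, 0x00, 0x63, 0x90, 0x15,
};

/* Read a big-endian 16-bit field at byte offset off. */
static unsigned int be16(const uint8_t *buf, size_t n, size_t off)
{
	assert(off + 2 <= n);
	return ((unsigned int)buf[off] << 8) | buf[off + 1];
}

/* IPv6 payload length: bytes 4-5 of the IPv6 header. */
static unsigned int ipv6_payload_len(void)
{
	return be16(pkt, sizeof(pkt), ETH_HLEN + 4);
}

/* UDP length: bytes 4-5 of the UDP header. */
static unsigned int udp_len(void)
{
	return be16(pkt, sizeof(pkt), ETH_HLEN + IPV6_HLEN + 4);
}

/*
 * Model of the assumed check: data_len is derived as len - headlen,
 * which wraps around when headlen > len. Returns 1 if pulling
 * 'pull' bytes would leave len below data_len.
 */
static int pull_would_bug(unsigned int len, unsigned int headlen,
			  unsigned int pull)
{
	unsigned int data_len = len - headlen;

	return (len - pull) < data_len;
}
```

With these bytes the IPv6 payload length is 37 and the UDP length is 99, so 14 + 40 + 37 matches the reported len=91 while 14 + 40 + 99 matches headlen=153. Under the assumptions above, pull_would_bug(91, 153, 54) returns 1, whereas a consistent skb with len equal to headlen would pass the same 54-byte pull. This only shows that the dump is internally inconsistent, not where the inconsistency was introduced.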



On Tue, Jul 4, 2023 at 4:27 PM Ian Kumlien <[email protected]> wrote:
>
> More stacktraces.. =)
>

[--8<--]

>
> On Tue, Jul 4, 2023 at 4:06 PM Ian Kumlien <[email protected]> wrote:
> >
> > On Tue, Jul 4, 2023 at 3:41 PM Paolo Abeni <[email protected]> wrote:
> > >
> > > On Tue, 2023-07-04 at 15:23 +0200, Ian Kumlien wrote:
> > > > On Tue, Jul 4, 2023 at 2:54 PM Paolo Abeni <[email protected]> wrote:
> > > > >
> > > > > On Tue, 2023-07-04 at 13:36 +0200, Ian Kumlien wrote:
> > > > > > > Proper bug this time:
> > > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > >
> > > > > To be sure, is this with the last patch I shared? this one I mean:
> > > >
> > > > > The current modifications I have, on top of v6.4.1, are:
> > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > > index cea28d30abb5..8552caa197f9 100644
> > > > --- a/net/core/skbuff.c
> > > > +++ b/net/core/skbuff.c
> > > > @@ -4272,6 +4272,11 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > > >
> > > > skb_shinfo(skb)->frag_list = NULL;
> > > >
> > > > + /* later code will clear the gso area in the shared info */
> > > > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > > > + if (err)
> > > > + goto err_linearize;
> > > > +
> > > > while (list_skb) {
> > > > nskb = list_skb;
> > > > list_skb = list_skb->next;
> > > > @@ -4328,6 +4333,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > > >
> > > > skb->prev = tail;
> > > >
> > > > + if (WARN_ON_ONCE(!skb->next))
> > > > + goto err_linearize;
> > > > +
> > > > if (skb_needs_linearize(skb, features) &&
> > > > __skb_linearize(skb))
> > > > goto err_linearize;
> > > > ---
> > > >
> > > > > https://lore.kernel.org/netdev/[email protected]/
> > > > >
> > > > > Could you please additionally enable CONFIG_DEBUG_NET in your build?
> > > >
> > > > Sure, will do
> > > >
> > > > > Could you please give a detailed description of your network topology
> > > > > and the running traffic?
> > > >
> > > > This machine has two "real interfaces" and two interfaces that run as
> > > > bridges for virtual machines
> > > > eno1 - real internal
> > > > eno2 - bridge - internal
> > > > eno3 - real external
> > > > eno4 - bridge - external
> > > >
> > > > The bridges are used by three virtual machines, two of which are
> > > > attached on both networks
> > > >
> > > > Traffic seemed to be video streaming, at least at first, now I don't
> > > > really know. I do have a few smart devices so I assume there is
> > > > a bit of multicast traffic as well - but not really anything unusual as such.
> > >
> > > Is there any XDP program running on the host side? Possibly changing
> > > the packet hdr?
> >
> > Only systemd standard things, I haven't done anything and the normal
> > nftables fw doesn't do anything special
> >
> > > Thanks!
> > >
> > > /P
> > >

2023-07-05 11:23:23

by Paolo Abeni

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> More stacktraces.. =)
>
> cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> [ 411.413767] ------------[ cut here ]------------
> [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> net/ipv6/udp.c:787)

I'm really running out of ideas here...

This is:

WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);

which sort of hints at the skb being shared (skb->users > 1) while
enqueued in multiple places (bridge local input and br forward/flood to
tun device). I audited the bridge mc flooding code, and I could not find
how a shared skb could land in the local input path.

Anyway the other splats reported here and in later emails are
compatible with shared skbs.
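
To make the sharing problem concrete, here is a minimal userspace sketch
in C. It is not kernel code: toy_skb, toy_get, toy_put, toy_shared and
toy_share_check are hypothetical stand-ins that only mimic the idea behind
skb->users, skb_shared() and skb_share_check(). It shows how a per-owner
flag such as partial_cov leaks between two holders of one buffer, and how
taking a private copy stops that.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for struct sk_buff: only the fields needed here. */
struct toy_skb {
	int users;        /* reference count, like skb->users */
	int partial_cov;  /* per-owner scratch state, like the UDP control block */
};

/* Allocate a buffer with a single holder; NULL on allocation failure. */
static struct toy_skb *toy_alloc(void)
{
	struct toy_skb *skb = calloc(1, sizeof(*skb));

	if (skb)
		skb->users = 1;
	return skb;
}

/* Take an extra reference, as a second queue would. */
static struct toy_skb *toy_get(struct toy_skb *skb)
{
	skb->users++;
	return skb;
}

/* Drop a reference; free the buffer on the last one. */
static void toy_put(struct toy_skb *skb)
{
	assert(skb->users > 0);
	if (--skb->users == 0)
		free(skb);
}

/* Mirrors the idea of skb_shared(): more than one holder. */
static int toy_shared(const struct toy_skb *skb)
{
	return skb->users > 1;
}

/*
 * Mirrors the idea of skb_share_check(): if the buffer is shared, hand
 * back a private copy and drop our reference on the original. Returns
 * NULL if the copy cannot be allocated.
 */
static struct toy_skb *toy_share_check(struct toy_skb *skb)
{
	struct toy_skb *copy;

	if (!toy_shared(skb))
		return skb;

	copy = malloc(sizeof(*copy));
	if (copy) {
		memcpy(copy, skb, sizeof(*copy));
		copy->users = 1;
	}
	toy_put(skb);
	return copy;
}
```

With two holders of one toy_skb, a write to partial_cov through either
pointer is seen by both; after toy_share_check() the caller owns a
separate buffer and the other holder is no longer affected.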

The above leads to another bunch of questions:
* can you reproduce the issue after disabling 'rx-gro-list' on the
ingress device? (while keeping 'rx-udp-gro-forwarding' on).
* do you have by chance qdiscs on top of the VM tun devices?

The last patch I shared was buggy, as it attempts to unclone the skb
after already touching skb_shared_info.

Could you please replace that patch with the following?

Thanks!

Paolo
---
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6c5915efbc17..0b0f4309506d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4261,6 +4261,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

 	skb_push(skb, -skb_network_offset(skb) + offset);

+	if (WARN_ON_ONCE(skb_shared(skb))) {
+		skb = skb_share_check(skb, GFP_ATOMIC);
+		if (!skb)
+			goto err_linearize;
+	}
+
+	/* later code will clear the gso area in the shared info */
+	err = skb_unclone(skb, GFP_ATOMIC);
+	if (err)
+		goto err_linearize;
+
 	skb_shinfo(skb)->frag_list = NULL;

 	while (list_skb) {
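
The ordering the patch above relies on can be sketched with a toy
userspace model in C. This is an illustration under stated assumptions,
not the kernel implementation: toy_shinfo, toy_buf, toy_clone, toy_unclone
and toy_segment_list are hypothetical names, and the real skb_unclone()
does considerably more. The point is only that the private copy has to be
taken before frag_list is cleared, otherwise the other clone loses its
segments.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for skb_shared_info: data shared between clones. */
struct toy_shinfo {
	int dataref;      /* number of buffers pointing at this block */
	void *frag_list;  /* chained segments, as used by fraglist GRO */
};

/* Toy stand-in for a (possibly cloned) sk_buff header. */
struct toy_buf {
	struct toy_shinfo *shinfo;
};

/* Create a clone: a second header pointing at the same shared block. */
static void toy_clone(struct toy_buf *dst, const struct toy_buf *src)
{
	assert(src->shinfo != NULL);
	dst->shinfo = src->shinfo;
	dst->shinfo->dataref++;
}

/*
 * Mirrors the idea of skb_unclone(): if the shared block has other
 * users, give this buffer a private copy. Returns 0 on success and
 * -1 if the copy cannot be allocated.
 */
static int toy_unclone(struct toy_buf *buf)
{
	struct toy_shinfo *priv;

	if (buf->shinfo->dataref == 1)
		return 0;

	priv = malloc(sizeof(*priv));
	if (!priv)
		return -1;
	*priv = *buf->shinfo;
	priv->dataref = 1;
	buf->shinfo->dataref--;
	buf->shinfo = priv;
	return 0;
}

/* Correct order: privatise first, then clear the fragment list. */
static int toy_segment_list(struct toy_buf *buf)
{
	if (toy_unclone(buf))
		return -1;
	buf->shinfo->frag_list = NULL;
	return 0;
}
```

If the two statements in toy_segment_list() were swapped, the NULL store
would land in the block still shared with the other clone, which is the
kind of corruption the comment in the patch warns about.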


2023-07-05 12:02:31

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
>
> On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > More stacktraces.. =)
> >
> > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > [ 411.413767] ------------[ cut here ]------------
> > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > net/ipv6/udp.c:787)
>
> I'm really running out of ideas here...
>
> This is:
>
> WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
>
> sort of hint skb being shared (skb->users > 1) while enqueued in
> multiple places (bridge local input and br forward/flood to tun
> device). I audited the bridge mc flooding code, and I could not find
> how a shared skb could land into the local input path.
>
> Anyway the other splats reported here and in later emails are
> compatible with shared skbs.
>
> The above leads to another bunch of questions:
> * can you reproduce the issue after disabling 'rx-gro-list' on the
> ingress device? (while keeping 'rx-udp-gro-forwarding' on).

With rx-gro-list off, as in never turned on, everything seems to run fine

> * do you have by chance qdiscs on top of the VM tun devices?

default qdisc is fq

> The last patch I shared was buggy, as it attempts to unclone the skb
> after already touching skb_shared_info.
>
> Could you please replace such patch with the following?

Will do, building atm

> Thanks!
>
> Paolo
> ---
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 6c5915efbc17..0b0f4309506d 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4261,6 +4261,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>
> skb_push(skb, -skb_network_offset(skb) + offset);
>
> + if (WARN_ON_ONCE(skb_shared(skb))) {
> + skb = skb_share_check(skb, GFP_ATOMIC);
> + if (!skb)
> + goto err_linearize;
> + }
> +
> + /* later code will clear the gso area in the shared info */
> + err = skb_unclone(skb, GFP_ATOMIC);
> + if (err)
> + goto err_linearize;
> +
> skb_shinfo(skb)->frag_list = NULL;
>
> while (list_skb) {
>

2023-07-05 13:54:07

by Paolo Abeni

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> >
> > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > More stacktraces.. =)
> > >
> > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > [ 411.413767] ------------[ cut here ]------------
> > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > net/ipv6/udp.c:787)
> >
> > I'm really running out of ideas here...
> >
> > This is:
> >
> > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> >
> > sort of hint skb being shared (skb->users > 1) while enqueued in
> > multiple places (bridge local input and br forward/flood to tun
> > device). I audited the bridge mc flooding code, and I could not find
> > how a shared skb could land into the local input path.
> >
> > Anyway the other splats reported here and in later emails are
> > compatible with shared skbs.
> >
> > The above leads to another bunch of questions:
> > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
>
> With rx-gro-list off, as in never turned on, everything seems to run fine
>
> > * do you have by chance qdiscs on top of the VM tun devices?
>
> default qdisc is fq

IIRC libvirt could reset the qdisc to noqueue for the owned tun
devices.

Could you please report the output of:

tc -d -s qdisc show dev <tun dev name>

Thanks!

/P


2023-07-05 14:14:59

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <[email protected]> wrote:
>
> On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> > >
> > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > More stacktraces.. =)
> > > >
> > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > [ 411.413767] ------------[ cut here ]------------
> > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > net/ipv6/udp.c:787)
> > >
> > > I'm really running out of ideas here...
> > >
> > > This is:
> > >
> > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > >
> > > sort of hint skb being shared (skb->users > 1) while enqueued in
> > > multiple places (bridge local input and br forward/flood to tun
> > > device). I audited the bridge mc flooding code, and I could not find
> > > how a shared skb could land into the local input path.
> > >
> > > Anyway the other splats reported here and in later emails are
> > > compatible with shared skbs.
> > >
> > > The above leads to another bunch of questions:
> > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> >
> > With rx-gro-list off, as in never turned on, everything seems to run fine
> >
> > > * do you have by chance qdiscs on top of the VM tun devices?
> >
> > default qdisc is fq
>
> IIRC libvirt could reset the qdisc to noqueue for the owned tun
> devices.
>
> Could you please report the output of:
>
> tc -d -s qdisc show dev <tun dev name>

I don't have these set:
CONFIG_NET_SCH_INGRESS
CONFIG_NET_SCHED

so tc just gives an error...

> Thanks!
>
> /P
>

2023-07-06 08:59:11

by Paolo Abeni

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Wed, 2023-07-05 at 15:58 +0200, Ian Kumlien wrote:
> On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <[email protected]> wrote:
> >
> > On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> > > >
> > > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > > More stacktraces.. =)
> > > > >
> > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > > [ 411.413767] ------------[ cut here ]------------
> > > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > > net/ipv6/udp.c:787)
> > > >
> > > > I'm really running out of ideas here...
> > > >
> > > > This is:
> > > >
> > > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > > >
> > > > sort of hint skb being shared (skb->users > 1) while enqueued in
> > > > multiple places (bridge local input and br forward/flood to tun
> > > > device). I audited the bridge mc flooding code, and I could not find
> > > > how a shared skb could land into the local input path.
> > > >
> > > > Anyway the other splats reported here and in later emails are
> > > > compatible with shared skbs.
> > > >
> > > > The above leads to another bunch of questions:
> > > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> > >
> > > With rx-gro-list off, as in never turned on, everything seems to run fine
> > >
> > > > * do you have by chance qdiscs on top of the VM tun devices?
> > >
> > > default qdisc is fq
> >
> > IIRC libvirt could reset the qdisc to noqueue for the owned tun
> > devices.
> >
> > Could you please report the output of:
> >
> > tc -d -s qdisc show dev <tun dev name>
>
> I don't have these set:
> CONFIG_NET_SCH_INGRESS
> CONFIG_NET_SCHED
>
> so tc just gives an error...

The above is confusing. As CONFIG_NET_SCH_DEFAULT depends on
CONFIG_NET_SCHED, you should not have a default qdisc either ;)

Could you please share your kernel config?

Thanks!

/P


2023-07-06 11:44:25

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Thu, Jul 6, 2023 at 10:42 AM Paolo Abeni <[email protected]> wrote:
> On Wed, 2023-07-05 at 15:58 +0200, Ian Kumlien wrote:
> > On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <[email protected]> wrote:
> > >
> > > On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > > > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> > > > >
> > > > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > > > More stacktraces.. =)
> > > > > >
> > > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > > > [ 411.413767] ------------[ cut here ]------------
> > > > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > > > net/ipv6/udp.c:787)
> > > > >
> > > > > I'm really running out of ideas here...
> > > > >
> > > > > This is:
> > > > >
> > > > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > > > >
> > > > > sort of hint skb being shared (skb->users > 1) while enqueued in
> > > > > multiple places (bridge local input and br forward/flood to tun
> > > > > device). I audited the bridge mc flooding code, and I could not find
> > > > > how a shared skb could land into the local input path.
> > > > >
> > > > > Anyway the other splats reported here and in later emails are
> > > > > compatible with shared skbs.
> > > > >
> > > > > The above leads to another bunch of questions:
> > > > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> > > >
> > > > With rx-gro-list off, as in never turned on, everything seems to run fine
> > > >
> > > > > * do you have by chance qdiscs on top of the VM tun devices?
> > > >
> > > > default qdisc is fq
> > >
> > > IIRC libvirt could reset the qdisc to noqueue for the owned tun
> > > devices.
> > >
> > > Could you please report the output of:
> > >
> > > tc -d -s qdisc show dev <tun dev name>
> >
> > I don't have these set:
> > CONFIG_NET_SCH_INGRESS
> > CONFIG_NET_SCHED
> >
> > so tc just gives an error...
>
> The above is confusing. As CONFIG_NET_SCH_DEFAULT depends on
> CONFIG_NET_SCHED, you should not have a default qdisc either ;)

Well it's still set in sysctl - dunno if it fails

> Could you please share your kernel config?

Sure...

As a side note, it hasn't crashed - no traces since we did the last change

For reference, this is the git diff on the running kernel's source tree:
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index cea28d30abb5..1b2394ebaf33 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -4270,6 +4270,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

 	skb_push(skb, -skb_network_offset(skb) + offset);

+	if (WARN_ON_ONCE(skb_shared(skb))) {
+		skb = skb_share_check(skb, GFP_ATOMIC);
+		if (!skb)
+			goto err_linearize;
+	}
+
+	/* later code will clear the gso area in the shared info */
+	err = skb_header_unclone(skb, GFP_ATOMIC);
+	if (err)
+		goto err_linearize;
+
 	skb_shinfo(skb)->frag_list = NULL;

 	while (list_skb) {
@@ -4328,6 +4339,9 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,

 	skb->prev = tail;

+	if (WARN_ON_ONCE(!skb->next))
+		goto err_linearize;
+
 	if (skb_needs_linearize(skb, features) &&
 	    __skb_linearize(skb))
 		goto err_linearize;
---

> Thanks!
>
> /P
>


Attachments:
config-6.4.1 (125.65 kB)

2023-07-06 13:25:02

by Paolo Abeni

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Thu, 2023-07-06 at 13:27 +0200, Ian Kumlien wrote:
> On Thu, Jul 6, 2023 at 10:42 AM Paolo Abeni <[email protected]> wrote:
> > On Wed, 2023-07-05 at 15:58 +0200, Ian Kumlien wrote:
> > > On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <[email protected]> wrote:
> > > >
> > > > On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > > > > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> > > > > >
> > > > > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > > > > More stacktraces.. =)
> > > > > > >
> > > > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > > > > [ 411.413767] ------------[ cut here ]------------
> > > > > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > > > > net/ipv6/udp.c:787)
> > > > > >
> > > > > > I'm really running out of ideas here...
> > > > > >
> > > > > > This is:
> > > > > >
> > > > > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > > > > >
> > > > > > sort of hint skb being shared (skb->users > 1) while enqueued in
> > > > > > multiple places (bridge local input and br forward/flood to tun
> > > > > > device). I audited the bridge mc flooding code, and I could not find
> > > > > > how a shared skb could land into the local input path.
> > > > > >
> > > > > > Anyway the other splats reported here and in later emails are
> > > > > > compatible with shared skbs.
> > > > > >
> > > > > > The above leads to another bunch of questions:
> > > > > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > > > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> > > > >
> > > > > With rx-gro-list off, as in never turned on, everything seems to run fine
> > > > >
> > > > > > * do you have by chance qdiscs on top of the VM tun devices?
> > > > >
> > > > > default qdisc is fq
> > > >
> > > > IIRC libvirt could reset the qdisc to noqueue for the owned tun
> > > > devices.
> > > >
> > > > Could you please report the output of:
> > > >
> > > > tc -d -s qdisc show dev <tun dev name>
> > >
> > > I don't have these set:
> > > CONFIG_NET_SCH_INGRESS
> > > CONFIG_NET_SCHED
> > >
> > > so tc just gives an error...
> >
> > The above is confusing. As CONFIG_NET_SCH_DEFAULT depends on
> > CONFIG_NET_SCHED, you should not have a default qdisc either ;)
>
> Well it's still set in sysctl - dunno if it fails
>
> > Could you please share your kernel config?
>
> Sure...
>
> As a side note, it hasn't crashed - no traces since we did the last change

It sounds like an encouraging sign! (famous last words...). I'll wait 1
more day, then I'll submit formally...

> For reference, this is git diff on the running kernels source tree:
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index cea28d30abb5..1b2394ebaf33 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -4270,6 +4270,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
>
> skb_push(skb, -skb_network_offset(skb) + offset);
>
> + if (WARN_ON_ONCE(skb_shared(skb))) {
> + skb = skb_share_check(skb, GFP_ATOMIC);
> + if (!skb)
> + goto err_linearize;
> + }
> +
> + /* later code will clear the gso area in the shared info */
> + err = skb_header_unclone(skb, GFP_ATOMIC);
> + if (err)
> + goto err_linearize;
> +
> skb_shinfo(skb)->frag_list = NULL;
>
> while (list_skb) {

...the above check only, as the other 2 should only catch side
effects of the lack of this one. In any case the above addresses a real
issue, so we likely want it no matter what.

Thanks,

Paolo


2023-07-06 14:18:22

by Eric Dumazet

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Thu, Jul 6, 2023 at 3:02 PM Paolo Abeni <[email protected]> wrote:
>
> On Thu, 2023-07-06 at 13:27 +0200, Ian Kumlien wrote:
> > On Thu, Jul 6, 2023 at 10:42 AM Paolo Abeni <[email protected]> wrote:
> > > On Wed, 2023-07-05 at 15:58 +0200, Ian Kumlien wrote:
> > > > On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <[email protected]> wrote:
> > > > >
> > > > > On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > > > > > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> > > > > > >
> > > > > > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > > > > > More stacktraces.. =)
> > > > > > > >
> > > > > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > > > > > [ 411.413767] ------------[ cut here ]------------
> > > > > > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > > > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > > > > > net/ipv6/udp.c:787)
> > > > > > >
> > > > > > > I'm really running out of ideas here...
> > > > > > >
> > > > > > > This is:
> > > > > > >
> > > > > > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > > > > > >
> > > > > > > sort of hint skb being shared (skb->users > 1) while enqueued in
> > > > > > > multiple places (bridge local input and br forward/flood to tun
> > > > > > > device). I audited the bridge mc flooding code, and I could not find
> > > > > > > how a shared skb could land into the local input path.
> > > > > > >
> > > > > > > Anyway the other splats reported here and in later emails are
> > > > > > > compatible with shared skbs.
> > > > > > >
> > > > > > > The above leads to another bunch of questions:
> > > > > > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > > > > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> > > > > >
> > > > > > With rx-gro-list off, as in never turned on, everything seems to run fine
> > > > > >
> > > > > > > * do you have by chance qdiscs on top of the VM tun devices?
> > > > > >
> > > > > > default qdisc is fq
> > > > >
> > > > > IIRC libvirt could reset the qdisc to noqueue for the owned tun
> > > > > devices.
> > > > >
> > > > > Could you please report the output of:
> > > > >
> > > > > tc -d -s qdisc show dev <tun dev name>
> > > >
> > > > I don't have these set:
> > > > CONFIG_NET_SCH_INGRESS
> > > > CONFIG_NET_SCHED
> > > >
> > > > so tc just gives an error...
> > >
> > > The above is confusing. As CONFIG_NET_SCH_DEFAULT depends on
> > > CONFIG_NET_SCHED, you should not have a default qdisc either ;)
> >
> > Well it's still set in sysctl - dunno if it fails
> >
> > > Could you please share your kernel config?
> >
> > Sure...
> >
> > As a side note, it hasn't crashed - no traces since we did the last change
>
> It sounds like an encouraging sign! (famous last words...). I'll wait 1
> more day, then I'll submit formally...
>
> > For reference, this is git diff on the running kernels source tree:
> > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > index cea28d30abb5..1b2394ebaf33 100644
> > --- a/net/core/skbuff.c
> > +++ b/net/core/skbuff.c
> > @@ -4270,6 +4270,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> >
> > skb_push(skb, -skb_network_offset(skb) + offset);
> >
> > + if (WARN_ON_ONCE(skb_shared(skb))) {
> > + skb = skb_share_check(skb, GFP_ATOMIC);
> > + if (!skb)
> > + goto err_linearize;
> > + }
> > +
> > + /* later code will clear the gso area in the shared info */
> > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > + if (err)
> > + goto err_linearize;
> > +
> > skb_shinfo(skb)->frag_list = NULL;
> >
> > while (list_skb) {
>
> ...the above check only, as the other 2 should only catch-up side
> effects of lack of this one. In any case the above address a real
> issue, so we likely want it no-matter-what.
>

Interesting, I wonder if this could also fix some syzbot reports
Willem and I are investigating.

Any idea of when the bug was 'added' or 'revealed' ?

2023-07-06 15:09:34

by Paolo Abeni

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Thu, 2023-07-06 at 15:56 +0200, Eric Dumazet wrote:
> On Thu, Jul 6, 2023 at 3:02 PM Paolo Abeni <[email protected]> wrote:
> >
> > On Thu, 2023-07-06 at 13:27 +0200, Ian Kumlien wrote:
> > > On Thu, Jul 6, 2023 at 10:42 AM Paolo Abeni <[email protected]> wrote:
> > > > On Wed, 2023-07-05 at 15:58 +0200, Ian Kumlien wrote:
> > > > > On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <[email protected]> wrote:
> > > > > >
> > > > > > On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > > > > > > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > > > > > > More stacktraces.. =)
> > > > > > > > >
> > > > > > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > > > > > > [ 411.413767] ------------[ cut here ]------------
> > > > > > > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > > > > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > > > > > > net/ipv6/udp.c:787)
> > > > > > > >
> > > > > > > > I'm really running out of ideas here...
> > > > > > > >
> > > > > > > > This is:
> > > > > > > >
> > > > > > > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > > > > > > >
> > > > > > > > sort of hint skb being shared (skb->users > 1) while enqueued in
> > > > > > > > multiple places (bridge local input and br forward/flood to tun
> > > > > > > > device). I audited the bridge mc flooding code, and I could not find
> > > > > > > > how a shared skb could land into the local input path.
> > > > > > > >
> > > > > > > > Anyway the other splats reported here and in later emails are
> > > > > > > > compatible with shared skbs.
> > > > > > > >
> > > > > > > > The above leads to another bunch of questions:
> > > > > > > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > > > > > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> > > > > > >
> > > > > > > With rx-gro-list off, as in never turned on, everything seems to run fine
> > > > > > >
> > > > > > > > * do you have by chance qdiscs on top of the VM tun devices?
> > > > > > >
> > > > > > > default qdisc is fq
> > > > > >
> > > > > > IIRC libvirt could reset the qdisc to noqueue for the owned tun
> > > > > > devices.
> > > > > >
> > > > > > Could you please report the output of:
> > > > > >
> > > > > > tc -d -s qdisc show dev <tun dev name>
> > > > >
> > > > > I don't have these set:
> > > > > CONFIG_NET_SCH_INGRESS
> > > > > CONFIG_NET_SCHED
> > > > >
> > > > > so tc just gives an error...
> > > >
> > > > The above is confusing. As CONFIG_NET_SCH_DEFAULT depends on
> > > > CONFIG_NET_SCHED, you should not have a default qdisc either ;)
> > >
> > > Well it's still set in sysctl - dunno if it fails
> > >
> > > > Could you please share your kernel config?
> > >
> > > Sure...
> > >
> > > As a side note, it hasn't crashed - no traces since we did the last change
> >
> > It sounds like an encouraging sign! (famous last words...). I'll wait 1
> > more day, then I'll submit formally...
> >
> > > For reference, this is git diff on the running kernels source tree:
> > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > index cea28d30abb5..1b2394ebaf33 100644
> > > --- a/net/core/skbuff.c
> > > +++ b/net/core/skbuff.c
> > > @@ -4270,6 +4270,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > >
> > > skb_push(skb, -skb_network_offset(skb) + offset);
> > >
> > > + if (WARN_ON_ONCE(skb_shared(skb))) {
> > > + skb = skb_share_check(skb, GFP_ATOMIC);
> > > + if (!skb)
> > > + goto err_linearize;
> > > + }
> > > +
> > > + /* later code will clear the gso area in the shared info */
> > > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > > + if (err)
> > > + goto err_linearize;
> > > +
> > > skb_shinfo(skb)->frag_list = NULL;
> > >
> > > while (list_skb) {
> >
> > ...the above check only, as the other 2 should only catch-up side
> > effects of lack of this one. In any case the above address a real
> > issue, so we likely want it no-matter-what.
> >
>
> Interesting, I wonder if this could also fix some syzbot reports
> Willem and I are investigating.
>
> Any idea of when the bug was 'added' or 'revealed' ?

The issue specifically addressed above should be present since
frag_list introduction commit 3a1296a38d0c ("net: Support GRO/GSO
fraglist chaining."). AFAICS triggering it requires non trivial setup -
mcast rx on bridge with frag-list enabled and forwarding to multiple
ports - so perhaps syzkaller found it later due to improvements on its
side ?!?

Cheers,

Paolo


2023-07-06 16:33:21

by Ian Kumlien

Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Thu, Jul 6, 2023 at 4:04 PM Paolo Abeni <[email protected]> wrote:
>
> On Thu, 2023-07-06 at 15:56 +0200, Eric Dumazet wrote:
> > On Thu, Jul 6, 2023 at 3:02 PM Paolo Abeni <[email protected]> wrote:
> > >
> > > On Thu, 2023-07-06 at 13:27 +0200, Ian Kumlien wrote:
> > > > On Thu, Jul 6, 2023 at 10:42 AM Paolo Abeni <[email protected]> wrote:
> > > > > On Wed, 2023-07-05 at 15:58 +0200, Ian Kumlien wrote:
> > > > > > On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <[email protected]> wrote:
> > > > > > >
> > > > > > > On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > > > > > > > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > > > > > > > More stacktraces.. =)
> > > > > > > > > >
> > > > > > > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > > > > > > > [ 411.413767] ------------[ cut here ]------------
> > > > > > > > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > > > > > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > > > > > > > net/ipv6/udp.c:787)
> > > > > > > > >
> > > > > > > > > I'm really running out of ideas here...
> > > > > > > > >
> > > > > > > > > This is:
> > > > > > > > >
> > > > > > > > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > > > > > > > >
> > > > > > > > > sort of hint skb being shared (skb->users > 1) while enqueued in
> > > > > > > > > multiple places (bridge local input and br forward/flood to tun
> > > > > > > > > device). I audited the bridge mc flooding code, and I could not find
> > > > > > > > > how a shared skb could land into the local input path.
> > > > > > > > >
> > > > > > > > > Anyway the other splats reported here and in later emails are
> > > > > > > > > compatible with shared skbs.
> > > > > > > > >
> > > > > > > > > The above leads to another bunch of questions:
> > > > > > > > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > > > > > > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> > > > > > > >
> > > > > > > > With rx-gro-list off, as in never turned on, everything seems to run fine
> > > > > > > >
> > > > > > > > > * do you have by chance qdiscs on top of the VM tun devices?
> > > > > > > >
> > > > > > > > default qdisc is fq
> > > > > > >
> > > > > > > IIRC libvirt could reset the qdisc to noqueue for the owned tun
> > > > > > > devices.
> > > > > > >
> > > > > > > Could you please report the output of:
> > > > > > >
> > > > > > > tc -d -s qdisc show dev <tun dev name>
> > > > > >
> > > > > > I don't have these set:
> > > > > > CONFIG_NET_SCH_INGRESS
> > > > > > CONFIG_NET_SCHED
> > > > > >
> > > > > > so tc just gives an error...
> > > > >
> > > > > The above is confusing. As CONFIG_NET_SCH_DEFAULT depends on
> > > > > CONFIG_NET_SCHED, you should not have a default qdisc either ;)
> > > >
> > > > Well it's still set in sysctl - dunno if it fails
> > > >
> > > > > Could you please share your kernel config?
> > > >
> > > > Sure...
> > > >
> > > > As a side note, it hasn't crashed - no traces since we did the last change
> > >
> > > It sounds like an encouraging sign! (famous last words...). I'll wait 1
> > > more day, then I'll submit formally...
> > >
> > > > For reference, this is git diff on the running kernels source tree:
> > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > > index cea28d30abb5..1b2394ebaf33 100644
> > > > --- a/net/core/skbuff.c
> > > > +++ b/net/core/skbuff.c
> > > > @@ -4270,6 +4270,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > > >
> > > > skb_push(skb, -skb_network_offset(skb) + offset);
> > > >
> > > > + if (WARN_ON_ONCE(skb_shared(skb))) {
> > > > + skb = skb_share_check(skb, GFP_ATOMIC);
> > > > + if (!skb)
> > > > + goto err_linearize;
> > > > + }
> > > > +
> > > > + /* later code will clear the gso area in the shared info */
> > > > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > > > + if (err)
> > > > + goto err_linearize;
> > > > +
> > > > skb_shinfo(skb)->frag_list = NULL;
> > > >
> > > > while (list_skb) {
> > >
> > > ...the above check only, as the other 2 should only catch-up side
> > > effects of lack of this one. In any case the above address a real
> > > issue, so we likely want it no-matter-what.
> > >
> >
> > Interesting, I wonder if this could also fix some syzbot reports
> > Willem and I are investigating.
> >
> > Any idea of when the bug was 'added' or 'revealed' ?
>
> The issue specifically addressed above should be present since
> frag_list introduction commit 3a1296a38d0c ("net: Support GRO/GSO
> fraglist chaining."). AFAICS triggering it requires non trivial setup -
> mcast rx on bridge with frag-list enabled and forwarding to multiple
> ports - so perhaps syzkaller found it later due to improvements on its
> side ?!?

I'm also a bit afraid that we just haven't triggered it - I don't see
any warnings or anything... :/

> Cheers,
>
> Paolo
>

2023-07-06 17:29:56

by Paolo Abeni

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Thu, 2023-07-06 at 18:17 +0200, Ian Kumlien wrote:
> On Thu, Jul 6, 2023 at 4:04 PM Paolo Abeni <[email protected]> wrote:
> >
> > On Thu, 2023-07-06 at 15:56 +0200, Eric Dumazet wrote:
> > > On Thu, Jul 6, 2023 at 3:02 PM Paolo Abeni <[email protected]> wrote:
> > > >
> > > > On Thu, 2023-07-06 at 13:27 +0200, Ian Kumlien wrote:
> > > > > On Thu, Jul 6, 2023 at 10:42 AM Paolo Abeni <[email protected]> wrote:
> > > > > > On Wed, 2023-07-05 at 15:58 +0200, Ian Kumlien wrote:
> > > > > > > On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > >
> > > > > > > > On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > > > > > > > > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > > > > > > > > More stacktraces.. =)
> > > > > > > > > > >
> > > > > > > > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > > > > > > > > [ 411.413767] ------------[ cut here ]------------
> > > > > > > > > > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > > > > > > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > > > > > > > > net/ipv6/udp.c:787)
> > > > > > > > > >
> > > > > > > > > > I'm really running out of ideas here...
> > > > > > > > > >
> > > > > > > > > > This is:
> > > > > > > > > >
> > > > > > > > > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > > > > > > > > >
> > > > > > > > > > sort of hint skb being shared (skb->users > 1) while enqueued in
> > > > > > > > > > multiple places (bridge local input and br forward/flood to tun
> > > > > > > > > > device). I audited the bridge mc flooding code, and I could not find
> > > > > > > > > > how a shared skb could land into the local input path.
> > > > > > > > > >
> > > > > > > > > > Anyway the other splats reported here and in later emails are
> > > > > > > > > > compatible with shared skbs.
> > > > > > > > > >
> > > > > > > > > > The above leads to another bunch of questions:
> > > > > > > > > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > > > > > > > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> > > > > > > > >
> > > > > > > > > With rx-gro-list off, as in never turned on, everything seems to run fine
> > > > > > > > >
> > > > > > > > > > * do you have by chance qdiscs on top of the VM tun devices?
> > > > > > > > >
> > > > > > > > > default qdisc is fq
> > > > > > > >
> > > > > > > > IIRC libvirt could reset the qdisc to noqueue for the owned tun
> > > > > > > > devices.
> > > > > > > >
> > > > > > > > Could you please report the output of:
> > > > > > > >
> > > > > > > > tc -d -s qdisc show dev <tun dev name>
> > > > > > >
> > > > > > > I don't have these set:
> > > > > > > CONFIG_NET_SCH_INGRESS
> > > > > > > CONFIG_NET_SCHED
> > > > > > >
> > > > > > > so tc just gives an error...
> > > > > >
> > > > > > The above is confusing. AS CONFIG_NET_SCH_DEFAULT depends on
> > > > > > CONFIG_NET_SCHED, you should not have a default qdisc, too ;)
> > > > >
> > > > > Well it's still set in sysctl - dunno if it fails
> > > > >
> > > > > > Could you please share your kernel config?
> > > > >
> > > > > Sure...
> > > > >
> > > > > As a side note, it hasn't crashed - no traces since we did the last change
> > > >
> > > > It sounds like an encouraging sing! (last famous words...). I'll wait 1
> > > > more day, than I'll submit formally...
> > > >
> > > > > For reference, this is git diff on the running kernels source tree:
> > > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > > > index cea28d30abb5..1b2394ebaf33 100644
> > > > > --- a/net/core/skbuff.c
> > > > > +++ b/net/core/skbuff.c
> > > > > @@ -4270,6 +4270,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > > > >
> > > > > skb_push(skb, -skb_network_offset(skb) + offset);
> > > > >
> > > > > + if (WARN_ON_ONCE(skb_shared(skb))) {
> > > > > + skb = skb_share_check(skb, GFP_ATOMIC);
> > > > > + if (!skb)
> > > > > + goto err_linearize;
> > > > > + }
> > > > > +
> > > > > + /* later code will clear the gso area in the shared info */
> > > > > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > > > > + if (err)
> > > > > + goto err_linearize;
> > > > > +
> > > > > skb_shinfo(skb)->frag_list = NULL;
> > > > >
> > > > > while (list_skb) {
> > > >
> > > > ...the above check only, as the other 2 should only catch-up side
> > > > effects of lack of this one. In any case the above address a real
> > > > issue, so we likely want it no-matter-what.
> > > >
> > >
> > > Interesting, I wonder if this could also fix some syzbot reports
> > > Willem and I are investigating.
> > >
> > > Any idea of when the bug was 'added' or 'revealed' ?
> >
> > The issue specifically addressed above should be present since
> > frag_list introduction commit 3a1296a38d0c ("net: Support GRO/GSO
> > fraglist chaining."). AFAICS triggering it requires non trivial setup -
> > mcast rx on bridge with frag-list enabled and forwarding to multiple
> > ports - so perhaps syzkaller found it later due to improvements on its
> > side ?!?
>
> I'm also a bit afraid that we just haven't triggered it - i don't see
> any warnings or anything... :/

Let me try to clarify: I hope/think that this chunk alone:

+ /* later code will clear the gso area in the shared info */
+ err = skb_header_unclone(skb, GFP_ATOMIC);
+ if (err)
+ goto err_linearize;
+
skb_shinfo(skb)->frag_list = NULL;

while (list_skb) {

does the magic/avoids the skb corruptions -> if everything goes well,
you should not see any warnings at all. Running 'nstat' on the DUT
should give some hints about reaching the relevant code paths.
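
As a rough illustration of why unsharing before clearing frag_list
matters, here is a userspace toy model in Python. It is not kernel code:
the classes and helper names (SharedInfo, Skb, skb_clone, header_unclone,
segment_list) are hypothetical stand-ins that only loosely mirror
skb_shared_info, skb_clone() and skb_header_unclone(). It shows a clone
losing its segment list when the shared area is cleared in place, and
keeping it when the writer takes a private copy first.

```python
class SharedInfo:
    """Toy stand-in for the shared-info area that cloned skbs share."""
    def __init__(self, frag_list):
        self.frag_list = frag_list
        self.dataref = 1          # number of skbs pointing at this area


class Skb:
    """Toy stand-in for an skb: only tracks which shared area it uses."""
    def __init__(self, shinfo):
        self.shinfo = shinfo


def skb_clone(skb):
    # A clone gets its own Skb object but shares the same SharedInfo.
    skb.shinfo.dataref += 1
    return Skb(skb.shinfo)


def header_unclone(skb):
    # If the shared area is still referenced by another skb, take a
    # private copy before anyone writes to it.
    if skb.shinfo.dataref > 1:
        skb.shinfo.dataref -= 1
        skb.shinfo = SharedInfo(list(skb.shinfo.frag_list))


def segment_list(skb, unclone_first):
    # Models the "frag_list = NULL" step discussed above.
    if unclone_first:
        header_unclone(skb)
    segs = skb.shinfo.frag_list
    skb.shinfo.frag_list = None
    return segs


def demo(unclone_first):
    original = Skb(SharedInfo(["seg1", "seg2"]))
    clone = skb_clone(original)
    segment_list(original, unclone_first)
    # What does the other holder of the data see afterwards?
    return clone.shinfo.frag_list


print("without unclone:", demo(False))
print("with unclone:   ", demo(True))
```

In the model, the first call leaves the clone with no segment list at
all, which is the kind of silent corruption a second holder of the skb
data would trip over later.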

Cheers,

Paolo


2023-07-06 22:43:41

by Ian Kumlien

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Thu, Jul 6, 2023 at 7:10 PM Paolo Abeni <[email protected]> wrote:
> On Thu, 2023-07-06 at 18:17 +0200, Ian Kumlien wrote:
> > On Thu, Jul 6, 2023 at 4:04 PM Paolo Abeni <[email protected]> wrote:
> > >
> > > On Thu, 2023-07-06 at 15:56 +0200, Eric Dumazet wrote:
> > > > On Thu, Jul 6, 2023 at 3:02 PM Paolo Abeni <[email protected]> wrote:
> > > > >
> > > > > On Thu, 2023-07-06 at 13:27 +0200, Ian Kumlien wrote:
> > > > > > On Thu, Jul 6, 2023 at 10:42 AM Paolo Abeni <[email protected]> wrote:
> > > > > > > On Wed, 2023-07-05 at 15:58 +0200, Ian Kumlien wrote:
> > > > > > > > On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > > >
> > > > > > > > > On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > > > > > > > > > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > > > > >
> > > > > > > > > > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > > > > > > > > > More stacktraces.. =)
> > > > > > > > > > > >
> > > > > > > > > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > > > > > > > > > [ 411.413767] ------------[ cut here ]------------
> > > > > > > > > > > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > > > > > > > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > > > > > > > > > net/ipv6/udp.c:787)
> > > > > > > > > > >
> > > > > > > > > > > I'm really running out of ideas here...
> > > > > > > > > > >
> > > > > > > > > > > This is:
> > > > > > > > > > >
> > > > > > > > > > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > > > > > > > > > >
> > > > > > > > > > > sort of hint skb being shared (skb->users > 1) while enqueued in
> > > > > > > > > > > multiple places (bridge local input and br forward/flood to tun
> > > > > > > > > > > device). I audited the bridge mc flooding code, and I could not find
> > > > > > > > > > > how a shared skb could land into the local input path.
> > > > > > > > > > >
> > > > > > > > > > > Anyway the other splats reported here and in later emails are
> > > > > > > > > > > compatible with shared skbs.
> > > > > > > > > > >
> > > > > > > > > > > The above leads to another bunch of questions:
> > > > > > > > > > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > > > > > > > > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> > > > > > > > > >
> > > > > > > > > > With rx-gro-list off, as in never turned on, everything seems to run fine
> > > > > > > > > >
> > > > > > > > > > > * do you have by chance qdiscs on top of the VM tun devices?
> > > > > > > > > >
> > > > > > > > > > default qdisc is fq
> > > > > > > > >
> > > > > > > > > IIRC libvirt could reset the qdisc to noqueue for the owned tun
> > > > > > > > > devices.
> > > > > > > > >
> > > > > > > > > Could you please report the output of:
> > > > > > > > >
> > > > > > > > > tc -d -s qdisc show dev <tun dev name>
> > > > > > > >
> > > > > > > > I don't have these set:
> > > > > > > > CONFIG_NET_SCH_INGRESS
> > > > > > > > CONFIG_NET_SCHED
> > > > > > > >
> > > > > > > > so tc just gives an error...
> > > > > > >
> > > > > > > The above is confusing. AS CONFIG_NET_SCH_DEFAULT depends on
> > > > > > > CONFIG_NET_SCHED, you should not have a default qdisc, too ;)
> > > > > >
> > > > > > Well it's still set in sysctl - dunno if it fails
> > > > > >
> > > > > > > Could you please share your kernel config?
> > > > > >
> > > > > > Sure...
> > > > > >
> > > > > > As a side note, it hasn't crashed - no traces since we did the last change
> > > > >
> > > > > It sounds like an encouraging sing! (last famous words...). I'll wait 1
> > > > > more day, than I'll submit formally...
> > > > >
> > > > > > For reference, this is git diff on the running kernels source tree:
> > > > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > > > > index cea28d30abb5..1b2394ebaf33 100644
> > > > > > --- a/net/core/skbuff.c
> > > > > > +++ b/net/core/skbuff.c
> > > > > > @@ -4270,6 +4270,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > > > > >
> > > > > > skb_push(skb, -skb_network_offset(skb) + offset);
> > > > > >
> > > > > > + if (WARN_ON_ONCE(skb_shared(skb))) {
> > > > > > + skb = skb_share_check(skb, GFP_ATOMIC);
> > > > > > + if (!skb)
> > > > > > + goto err_linearize;
> > > > > > + }
> > > > > > +
> > > > > > + /* later code will clear the gso area in the shared info */
> > > > > > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > > > > > + if (err)
> > > > > > + goto err_linearize;
> > > > > > +
> > > > > > skb_shinfo(skb)->frag_list = NULL;
> > > > > >
> > > > > > while (list_skb) {
> > > > >
> > > > > ...the above check only, as the other 2 should only catch-up side
> > > > > effects of lack of this one. In any case the above address a real
> > > > > issue, so we likely want it no-matter-what.
> > > > >
> > > >
> > > > Interesting, I wonder if this could also fix some syzbot reports
> > > > Willem and I are investigating.
> > > >
> > > > Any idea of when the bug was 'added' or 'revealed' ?
> > >
> > > The issue specifically addressed above should be present since
> > > frag_list introduction commit 3a1296a38d0c ("net: Support GRO/GSO
> > > fraglist chaining."). AFAICS triggering it requires non trivial setup -
> > > mcast rx on bridge with frag-list enabled and forwarding to multiple
> > > ports - so perhaps syzkaller found it later due to improvements on its
> > > side ?!?
> >
> > I'm also a bit afraid that we just haven't triggered it - i don't see
> > any warnings or anything... :/
>
> Let me try to clarify: I hope/think that this chunk alone:
>
> + /* later code will clear the gso area in the shared info */
> + err = skb_header_unclone(skb, GFP_ATOMIC);
> + if (err)
> + goto err_linearize;
> +
> skb_shinfo(skb)->frag_list = NULL;
>
> while (list_skb) {
>
> does the magic/avoids the skb corruptions -> if everything goes well,
> you should not see any warnings at all. Running 'nstat' in the DUT
> should give some hints about reaching the relevant code paths.

Sorry about the html mail... but...

I was fully expecting a warning from:
if (WARN_ON_ONCE(skb_shared(skb))) {

But I could be completely wrong and things =)

Which fields would I be looking at in nstat?
nstat
#kernel
IpInReceives 11076 0.0
IpForwDatagrams 2384 0.0
IpInDelivers 5107 0.0
IpOutRequests 3478 0.0
IcmpInMsgs 42 0.0
IcmpInDestUnreachs 9 0.0
IcmpInEchos 32 0.0
IcmpInEchoReps 1 0.0
IcmpOutMsgs 49 0.0
IcmpOutDestUnreachs 15 0.0
IcmpOutEchos 2 0.0
IcmpOutEchoReps 32 0.0
IcmpMsgInType0 1 0.0
IcmpMsgInType3 9 0.0
IcmpMsgInType8 32 0.0
IcmpMsgOutType0 32 0.0
IcmpMsgOutType3 15 0.0
IcmpMsgOutType8 2 0.0
TcpInSegs 220 0.0
TcpOutSegs 381 0.0
UdpInDatagrams 4893 0.0
UdpInErrors 5 0.0
UdpOutDatagrams 655 0.0
UdpRcvbufErrors 5 0.0
UdpIgnoredMulti 86 0.0
Ip6InReceives 7155 0.0
Ip6InDelivers 7139 0.0
Ip6OutRequests 136 0.0
Ip6OutNoRoutes 8 0.0
Ip6InMcastPkts 7146 0.0
Ip6OutMcastPkts 130 0.0
Ip6InOctets 1062180 0.0
Ip6OutOctets 41215 0.0
Ip6InMcastOctets 1061292 0.0
Ip6OutMcastOctets 40807 0.0
Ip6InNoECTPkts 7845 0.0
Icmp6InMsgs 44 0.0
Icmp6OutMsgs 21 0.0
Icmp6InGroupMembQueries 8 0.0
Icmp6InRouterAdvertisements 4 0.0
Icmp6InNeighborSolicits 6 0.0
Icmp6InNeighborAdvertisements 26 0.0
Icmp6OutNeighborSolicits 3 0.0
Icmp6OutNeighborAdvertisements 6 0.0
Icmp6OutMLDv2Reports 12 0.0
Icmp6InType130 8 0.0
Icmp6InType134 4 0.0
Icmp6InType135 6 0.0
Icmp6InType136 26 0.0
Icmp6OutType135 3 0.0
Icmp6OutType136 6 0.0
Icmp6OutType143 12 0.0
Udp6InDatagrams 6537 0.0
Udp6InErrors 1248 0.0
Udp6OutDatagrams 115 0.0
Udp6RcvbufErrors 1248 0.0
TcpExtTCPHPAcks 200 0.0
TcpExtTCPBacklogCoalesce 3 0.0
TcpExtIPReversePathFilter 89 0.0
TcpExtTCPAutoCorking 4 0.0
TcpExtTCPOrigDataSent 381 0.0
TcpExtTCPDelivered 381 0.0
IpExtInMcastPkts 4174 0.0
IpExtOutMcastPkts 68 0.0
IpExtInBcastPkts 86 0.0
IpExtOutBcastPkts 4 0.0
IpExtInOctets 1866664 0.0
IpExtOutOctets 1715287 0.0
IpExtInMcastOctets 539751 0.0
IpExtOutMcastOctets 25636 0.0
IpExtInBcastOctets 7131 0.0
IpExtOutBcastOctets 304 0.0
IpExtInNoECTPkts 12158 0.0

But we do have an extreme uptime for this test:
00:31:44 up 1 day, 10:55, 2 users, load average: 0,77, 0,75, 0,82

> Cheers,
>
> Paolo
>

2023-07-06 23:09:09

by Ian Kumlien

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Fri, Jul 7, 2023 at 12:32 AM Ian Kumlien <[email protected]> wrote:
> On Thu, Jul 6, 2023 at 7:10 PM Paolo Abeni <[email protected]> wrote:
> > Let me try to clarify: I hope/think that this chunk alone:
> >
> > + /* later code will clear the gso area in the shared info */
> > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > + if (err)
> > + goto err_linearize;
> > +
> > skb_shinfo(skb)->frag_list = NULL;
> >
> > while (list_skb) {
> >
> > does the magic/avoids the skb corruptions -> if everything goes well,
> > you should not see any warnings at all. Running 'nstat' in the DUT
> > should give some hints about reaching the relevant code paths.

Ah yeah... I'm a bit tired atm - I see your point - with moving it up a bit.

So anyway, Tested-by: [email protected] etc =)

2023-07-07 07:17:53

by Paolo Abeni

[permalink] [raw]
Subject: Re: [Intel-wired-lan] bug with rx-udp-gro-forwarding offloading?

On Fri, 2023-07-07 at 00:32 +0200, Ian Kumlien wrote:
> On Thu, Jul 6, 2023 at 7:10 PM Paolo Abeni <[email protected]> wrote:
> > On Thu, 2023-07-06 at 18:17 +0200, Ian Kumlien wrote:
> > > On Thu, Jul 6, 2023 at 4:04 PM Paolo Abeni <[email protected]> wrote:
> > > >
> > > > On Thu, 2023-07-06 at 15:56 +0200, Eric Dumazet wrote:
> > > > > On Thu, Jul 6, 2023 at 3:02 PM Paolo Abeni <[email protected]> wrote:
> > > > > >
> > > > > > On Thu, 2023-07-06 at 13:27 +0200, Ian Kumlien wrote:
> > > > > > > On Thu, Jul 6, 2023 at 10:42 AM Paolo Abeni <[email protected]> wrote:
> > > > > > > > On Wed, 2023-07-05 at 15:58 +0200, Ian Kumlien wrote:
> > > > > > > > > On Wed, Jul 5, 2023 at 3:29 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > > > >
> > > > > > > > > > On Wed, 2023-07-05 at 13:32 +0200, Ian Kumlien wrote:
> > > > > > > > > > > On Wed, Jul 5, 2023 at 12:28 PM Paolo Abeni <[email protected]> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, 2023-07-04 at 16:27 +0200, Ian Kumlien wrote:
> > > > > > > > > > > > > More stacktraces.. =)
> > > > > > > > > > > > >
> > > > > > > > > > > > > cat bug.txt | ./scripts/decode_stacktrace.sh vmlinux
> > > > > > > > > > > > > [ 411.413767] ------------[ cut here ]------------
> > > > > > > > > > > > > > [ 411.413792] WARNING: CPU: 9 PID: 942 at include/net/udp.h:509
> > > > > > > > > > > > > udpv6_queue_rcv_skb (./include/net/udp.h:509 net/ipv6/udp.c:800
> > > > > > > > > > > > > net/ipv6/udp.c:787)
> > > > > > > > > > > >
> > > > > > > > > > > > I'm really running out of ideas here...
> > > > > > > > > > > >
> > > > > > > > > > > > This is:
> > > > > > > > > > > >
> > > > > > > > > > > > WARN_ON_ONCE(UDP_SKB_CB(skb)->partial_cov);
> > > > > > > > > > > >
> > > > > > > > > > > > sort of hint skb being shared (skb->users > 1) while enqueued in
> > > > > > > > > > > > multiple places (bridge local input and br forward/flood to tun
> > > > > > > > > > > > device). I audited the bridge mc flooding code, and I could not find
> > > > > > > > > > > > how a shared skb could land into the local input path.
> > > > > > > > > > > >
> > > > > > > > > > > > Anyway the other splats reported here and in later emails are
> > > > > > > > > > > > compatible with shared skbs.
> > > > > > > > > > > >
> > > > > > > > > > > > The above leads to another bunch of questions:
> > > > > > > > > > > > * can you reproduce the issue after disabling 'rx-gro-list' on the
> > > > > > > > > > > > ingress device? (while keeping 'rx-udp-gro-forwarding' on).
> > > > > > > > > > >
> > > > > > > > > > > With rx-gro-list off, as in never turned on, everything seems to run fine
> > > > > > > > > > >
> > > > > > > > > > > > * do you have by chance qdiscs on top of the VM tun devices?
> > > > > > > > > > >
> > > > > > > > > > > default qdisc is fq
> > > > > > > > > >
> > > > > > > > > > IIRC libvirt could reset the qdisc to noqueue for the owned tun
> > > > > > > > > > devices.
> > > > > > > > > >
> > > > > > > > > > Could you please report the output of:
> > > > > > > > > >
> > > > > > > > > > tc -d -s qdisc show dev <tun dev name>
> > > > > > > > >
> > > > > > > > > I don't have these set:
> > > > > > > > > CONFIG_NET_SCH_INGRESS
> > > > > > > > > CONFIG_NET_SCHED
> > > > > > > > >
> > > > > > > > > so tc just gives an error...
> > > > > > > >
> > > > > > > > The above is confusing. AS CONFIG_NET_SCH_DEFAULT depends on
> > > > > > > > CONFIG_NET_SCHED, you should not have a default qdisc, too ;)
> > > > > > >
> > > > > > > Well it's still set in sysctl - dunno if it fails
> > > > > > >
> > > > > > > > Could you please share your kernel config?
> > > > > > >
> > > > > > > Sure...
> > > > > > >
> > > > > > > As a side note, it hasn't crashed - no traces since we did the last change
> > > > > >
> > > > > > It sounds like an encouraging sing! (last famous words...). I'll wait 1
> > > > > > more day, than I'll submit formally...
> > > > > >
> > > > > > > For reference, this is git diff on the running kernels source tree:
> > > > > > > diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> > > > > > > index cea28d30abb5..1b2394ebaf33 100644
> > > > > > > --- a/net/core/skbuff.c
> > > > > > > +++ b/net/core/skbuff.c
> > > > > > > @@ -4270,6 +4270,17 @@ struct sk_buff *skb_segment_list(struct sk_buff *skb,
> > > > > > >
> > > > > > > skb_push(skb, -skb_network_offset(skb) + offset);
> > > > > > >
> > > > > > > + if (WARN_ON_ONCE(skb_shared(skb))) {
> > > > > > > + skb = skb_share_check(skb, GFP_ATOMIC);
> > > > > > > + if (!skb)
> > > > > > > + goto err_linearize;
> > > > > > > + }
> > > > > > > +
> > > > > > > + /* later code will clear the gso area in the shared info */
> > > > > > > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > > > > > > + if (err)
> > > > > > > + goto err_linearize;
> > > > > > > +
> > > > > > > skb_shinfo(skb)->frag_list = NULL;
> > > > > > >
> > > > > > > while (list_skb) {
> > > > > >
> > > > > > ...the above check only, as the other 2 should only catch-up side
> > > > > > effects of lack of this one. In any case the above address a real
> > > > > > issue, so we likely want it no-matter-what.
> > > > > >
> > > > >
> > > > > Interesting, I wonder if this could also fix some syzbot reports
> > > > > Willem and I are investigating.
> > > > >
> > > > > Any idea of when the bug was 'added' or 'revealed' ?
> > > >
> > > > The issue specifically addressed above should be present since
> > > > frag_list introduction commit 3a1296a38d0c ("net: Support GRO/GSO
> > > > fraglist chaining."). AFAICS triggering it requires non trivial setup -
> > > > mcast rx on bridge with frag-list enabled and forwarding to multiple
> > > > ports - so perhaps syzkaller found it later due to improvements on its
> > > > side ?!?
> > >
> > > I'm also a bit afraid that we just haven't triggered it - i don't see
> > > any warnings or anything... :/
> >
> > Let me try to clarify: I hope/think that this chunk alone:
> >
> > + /* later code will clear the gso area in the shared info */
> > + err = skb_header_unclone(skb, GFP_ATOMIC);
> > + if (err)
> > + goto err_linearize;
> > +
> > skb_shinfo(skb)->frag_list = NULL;
> >
> > while (list_skb) {
> >
> > does the magic/avoids the skb corruptions -> if everything goes well,
> > you should not see any warnings at all. Running 'nstat' in the DUT
> > should give some hints about reaching the relevant code paths.
>
> Sorry about the html mail... but...
>
> I was fully expecting a warning from:
> if (WARN_ON_ONCE(skb_shared(skb))) {
>
> But I could be completely wrong and things =)
>
> Which fields would i be looking at in nstat
[...]
> UdpInDatagrams 4893 0.0
[...]
> Ip6InMcastPkts 7146 0.0
[...]
> Ip6InMcastOctets 1061292 0.0

The above ones. We have ingress mcast traffic, but the figures are
inconclusive about GRO aggregation taking place (Ip6InMcastOctets /
Ip6InMcastPkts > MTU would prove that). Similar thing for IPv4 mcast.
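
To make the ratio check concrete, here is a small Python sketch that
computes the average ingress mcast packet size from the counters quoted
in this thread. The helper names and the 1500-byte MTU are assumptions
for illustration only; the real MTU of the ingress device was not
reported.

```python
# Counters copied from the nstat output earlier in the thread.
NSTAT_SAMPLE = """\
Ip6InMcastPkts 7146 0.0
Ip6InMcastOctets 1061292 0.0
IpExtInMcastPkts 4174 0.0
IpExtInMcastOctets 539751 0.0
"""

ASSUMED_MTU = 1500  # assumption: typical Ethernet MTU, not from the thread


def parse_nstat(text):
    """Return {counter_name: value} from 'name value rate' lines."""
    counters = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1].isdigit():
            counters[fields[0]] = int(fields[1])
    return counters


def avg_packet_size(counters, octets_key, pkts_key):
    """Average bytes per packet, or 0.0 if no packets were counted."""
    pkts = counters.get(pkts_key, 0)
    return counters.get(octets_key, 0) / pkts if pkts else 0.0


counters = parse_nstat(NSTAT_SAMPLE)
avg6 = avg_packet_size(counters, "Ip6InMcastOctets", "Ip6InMcastPkts")
avg4 = avg_packet_size(counters, "IpExtInMcastOctets", "IpExtInMcastPkts")

for name, avg in (("IPv6", avg6), ("IPv4", avg4)):
    verdict = "above" if avg > ASSUMED_MTU else "not above"
    print(f"{name} mcast: {avg:.1f} bytes/packet, {verdict} the assumed MTU")
```

Both averages come out far below the assumed MTU, so these counters
neither prove nor rule out that aggregated packets were received.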

Still, the change looks sane and the uptime is encouraging. I'll submit it
formally with your reported/tested-by tags.

Many thanks for all the debugging effort!

Paolo