Recent commit aa626da947e9 ("iavf: Detach device during reset task")
removed netif_tx_stop_all_queues() on the assumption that the Tx queues
are already stopped by netif_device_detach() at the beginning of the
reset task. This assumption is incorrect because a link event that
arrives during the reset task can start the Tx queues again.

Revert this removal to fix the issue.
Reproducer:
1. Run some Tx traffic (e.g. iperf3) over iavf interface
2. Switch MTU of this interface in a loop
[root@host ~]# cat repro.sh
#!/bin/sh

IF=enp2s0f0v0

iperf3 -c 192.168.0.1 -t 600 --logfile /dev/null &
sleep 2

while :; do
        for i in 1280 1500 2000 900 ; do
                ip link set $IF mtu $i
                sleep 2
        done
done
[root@host ~]# ./repro.sh
Result:
[ 306.199917] iavf 0000:02:02.0 enp2s0f0v0: NIC Link is Up Speed is 40 Gbps Full Duplex
[ 308.205944] iavf 0000:02:02.0 enp2s0f0v0: NIC Link is Up Speed is 40 Gbps Full Duplex
[ 310.103223] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 310.110179] #PF: supervisor write access in kernel mode
[ 310.115396] #PF: error_code(0x0002) - not-present page
[ 310.120526] PGD 0 P4D 0
[ 310.123057] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 310.127408] CPU: 24 PID: 183 Comm: kworker/u64:9 Kdump: loaded Not tainted 6.1.0-rc3+ #2
[ 310.135485] Hardware name: Abacus electric, s.r.o. - [email protected] Super Server/H12SSW-iN, BIOS 2.4 04/13/2022
[ 310.145728] Workqueue: iavf iavf_reset_task [iavf]
[ 310.150520] RIP: 0010:iavf_xmit_frame_ring+0xd1/0xf70 [iavf]
[ 310.156180] Code: d0 0f 86 da 00 00 00 83 e8 01 0f b7 fa 29 f8 01 c8 39 c6 0f 8f a0 08 00 00 48 8b 45 20 48 8d 14 92 bf 01 00 00 00 4c 8d 3c d0 <49> 89 5f 08 8b 43 70 66 41 89 7f 14 41 89 47 10 f6 83 82 00 00 00
[ 310.174918] RSP: 0018:ffffbb5f0082caa0 EFLAGS: 00010293
[ 310.180137] RAX: 0000000000000000 RBX: ffff92345471a6e8 RCX: 0000000000000200
[ 310.187259] RDX: 0000000000000000 RSI: 000000000000000d RDI: 0000000000000001
[ 310.194385] RBP: ffff92341d249000 R08: ffff92434987fcac R09: 0000000000000001
[ 310.201509] R10: 0000000011f683b9 R11: 0000000011f50641 R12: 0000000000000008
[ 310.208631] R13: ffff923447500000 R14: 0000000000000000 R15: 0000000000000000
[ 310.215756] FS: 0000000000000000(0000) GS:ffff92434ee00000(0000) knlGS:0000000000000000
[ 310.223835] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 310.229572] CR2: 0000000000000008 CR3: 0000000fbc210004 CR4: 0000000000770ee0
[ 310.236696] PKRU: 55555554
[ 310.239399] Call Trace:
[ 310.241844] <IRQ>
[ 310.243855] ? dst_alloc+0x5b/0xb0
[ 310.247260] dev_hard_start_xmit+0x9e/0x1f0
[ 310.251439] sch_direct_xmit+0xa0/0x370
[ 310.255276] __qdisc_run+0x13e/0x580
[ 310.258848] __dev_queue_xmit+0x431/0xd00
[ 310.262851] ? selinux_ip_postroute+0x147/0x3f0
[ 310.267377] ip_finish_output2+0x26c/0x540
Fixes: aa626da947e9 ("iavf: Detach device during reset task")
Cc: Jacob Keller <[email protected]>
Cc: Patryk Piotrowski <[email protected]>
Cc: SlawomirX Laba <[email protected]>
Signed-off-by: Ivan Vecera <[email protected]>
---
drivers/net/ethernet/intel/iavf/iavf_main.c | 1 +
1 file changed, 1 insertion(+)
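
The ordering problem described in the commit message can be illustrated
with a small userspace model. This is only a sketch under stated
assumptions: every model_* name is hypothetical, the real driver state is
far richer, and the exact point at which the rings become unusable is
assumed rather than taken from the driver source. It shows the sequence
detach -> link event -> reset quiesce -> transmit, with and without the
extra queue stop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; these are not the kernel's structures. */
struct model_dev {
	bool tx_stopped;	/* stands in for the Tx queue "stopped" state */
	bool carrier_ok;	/* stands in for the carrier state */
	int *tx_ring;		/* NULL once the reset task has torn the rings down */
};

enum xmit_result { XMIT_SKIPPED, XMIT_SENT, XMIT_NULL_DEREF };

/* Models netif_device_detach(): the Tx queues are stopped. */
static void model_detach(struct model_dev *dev)
{
	dev->tx_stopped = true;
}

/* Models a link event handled while the reset task is running. */
static void model_link_event(struct model_dev *dev)
{
	dev->carrier_ok = true;
	dev->tx_stopped = false;	/* the queues are started again */
}

/* Models the "if (running)" block of the reset task. */
static void model_reset_quiesce(struct model_dev *dev, bool stop_queues)
{
	dev->carrier_ok = false;		/* netif_carrier_off() */
	if (stop_queues)
		dev->tx_stopped = true;		/* netif_tx_stop_all_queues() */
	dev->tx_ring = NULL;			/* assumption: rings unusable from here */
}

/* Models the stack handing a packet to the driver's xmit routine. */
static enum xmit_result model_xmit(const struct model_dev *dev)
{
	if (dev->tx_stopped)
		return XMIT_SKIPPED;		/* the driver is never called */
	if (dev->tx_ring == NULL)
		return XMIT_NULL_DEREF;		/* the driver would touch a NULL ring */
	return XMIT_SENT;
}

/* Runs detach -> link event -> quiesce -> xmit and reports the outcome. */
static enum xmit_result model_scenario(bool stop_queues)
{
	int ring[4] = { 0 };
	struct model_dev dev = {
		.tx_stopped = false,
		.carrier_ok = true,
		.tx_ring = ring,
	};

	model_detach(&dev);
	assert(dev.tx_stopped);		/* detach alone did stop the queue */
	model_link_event(&dev);
	model_reset_quiesce(&dev, stop_queues);
	return model_xmit(&dev);
}
```

In this model, model_scenario(false) ends in XMIT_NULL_DEREF, while
model_scenario(true) ends in XMIT_SKIPPED, which is the behaviour the
one-line change below is meant to restore.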
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
index 3fc572341781..5abcd66e7c7a 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_main.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
@@ -3033,6 +3033,7 @@ static void iavf_reset_task(struct work_struct *work)
 	if (running) {
 		netif_carrier_off(netdev);
+		netif_tx_stop_all_queues(netdev);
 		adapter->link_up = false;
 		iavf_napi_disable_all(adapter);
 	}
--
2.37.4
On 2022-11-08 10:35, Ivan Vecera wrote:
> Recent commit aa626da947e9 ("iavf: Detach device during reset task")
> removed netif_tx_stop_all_queues() with an assumption that Tx queues
> are already stopped by netif_device_detach() in the beginning of
> reset task. This assumption is incorrect because during reset
> task a potential link event can start Tx queues again.
> Revert this change to fix this issue.
>
> Reproducer:
> 1. Run some Tx traffic (e.g. iperf3) over iavf interface
> 2. Switch MTU of this interface in a loop
>
> [root@host ~]# cat repro.sh
> #!/bin/sh
>
> IF=enp2s0f0v0
>
> iperf3 -c 192.168.0.1 -t 600 --logfile /dev/null &
> sleep 2
>
> while :; do
> for i in 1280 1500 2000 900 ; do
> ip link set $IF mtu $i
> sleep 2
> done
> done
With this patch applied, iavf no longer crashes, but after a few
cycles of the reproducer Tx timeouts are observed:
[ 47.551151] iavf 0000:00:09.0 eth0: NIC Link is Up Speed is 10 Gbps Full Duplex
[ 54.035902] ------------[ cut here ]------------
[ 54.037397] NETDEV WATCHDOG: eth0 (iavf): transmit queue 3 timed out
[ 54.039264] WARNING: CPU: 1 PID: 0 at net/sched/sch_generic.c:526 dev_watchdog+0x20f/0x250
[ 54.041524] Modules linked in: 8021q intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass rapl pcspkr drm ramoops reed_solomon crct10dif_pclmul crc32_pclmul crc32c_intel ata_generic pata_acpi ghash_clmulni_intel ata_piix aesni_intel crypto_simd iavf libata be2iscsi bnx2i cnic uio cxgb4i cxgb4 tls libcxgbi libcxgb qla4xxx iscsi_boot_sysfs iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi
[ 54.049723] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.1.0-rc2+ #90
[ 54.051049] Hardware name: Red Hat KVM, BIOS 1.15.0-2.module+el8.6.0+14757+c25ee005 04/01/2014
[ 54.052898] RIP: 0010:dev_watchdog+0x20f/0x250
[ 54.053907] Code: 00 e9 4d ff ff ff 48 89 df c6 05 92 24 96 01 01 e8 c6 f2 f8 ff 44 89 e9 48 89 de 48 c7 c7 28 7f f6 a0 48 89 c2 e8 6e 65 23 00 <0f> 0b e9 2f ff ff ff e8 25 06 2a 00 85 c0 74 b5 80 3d 74 1b 96 01
[ 54.057282] RSP: 0018:ffffaf56c00e0e80 EFLAGS: 00010282
[ 54.058164] RAX: 0000000000000000 RBX: ffff993ed95b8000 RCX: 0000000000000103
[ 54.059345] RDX: 0000000000000103 RSI: 00000000000000f6 RDI: 00000000ffffffff
[ 54.060473] RBP: ffff993ed95b8508 R08: 0000000000000000 R09: c0000000fff7ffff
[ 54.061558] R10: 0000000000000001 R11: ffffaf56c00e0d18 R12: ffff993ed95b8420
[ 54.062640] R13: 0000000000000003 R14: ffff993ed95b8508 R15: ffff993ef74a06c0
[ 54.063681] FS: 0000000000000000(0000) GS:ffff993ef7480000(0000) knlGS:0000000000000000
[ 54.064867] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 54.065654] CR2: 00007f42309e1280 CR3: 0000000107f6a003 CR4: 0000000000170ee0
[ 54.066612] Call Trace:
[ 54.066985] <IRQ>
[ 54.067265] ? mq_change_real_num_tx+0xd0/0xd0
[ 54.067844] call_timer_fn+0xa1/0x2c0
[ 54.068330] ? mq_change_real_num_tx+0xd0/0xd0
[ 54.068916] run_timer_softirq+0x527/0x550
[ 54.069447] ? lock_is_held_type+0xd8/0x130
[ 54.069998] __do_softirq+0xc3/0x481
[ 54.070469] irq_exit_rcu+0xe4/0x120
[ 54.070963] sysvec_apic_timer_interrupt+0x9e/0xc0
[ 54.071604] </IRQ>
[ 54.071909] <TASK>
[ 54.072223] asm_sysvec_apic_timer_interrupt+0x16/0x20
[ 54.072942] RIP: 0010:default_idle+0x10/0x20
[ 54.073533] Code: 89 df 31 f6 5b 5d e9 ff 1c a5 ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc 0f 1f 44 00 00 eb 07 0f 00 2d f2 2a 42 00 fb f4 <c3> 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 0f 1f 44 00 00 65
This only occurs when the device is detached and reattached during reset.
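
As background for the warning above: the netdev watchdog only reports a
queue while the device is present and the carrier is on, and only if the
queue has been stopped for longer than the watchdog timeout since its last
transmit. The following is a minimal userspace sketch of that condition as
I understand it from net/sched/sch_generic.c. All model_* names are
hypothetical, the kernel's additional "interface is running" check is
omitted for brevity, and the sketch does not explain why the queue stays
stopped in this case.

```c
#include <stdbool.h>

/* Hypothetical userspace model of one Tx queue as seen by the watchdog. */
struct model_txq {
	bool stopped;			/* queue is currently stopped */
	unsigned long trans_start;	/* tick of the last transmit */
};

/* Wrap-safe "a is after b", in the spirit of the kernel's time_after(). */
static bool model_time_after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Returns true when the watchdog would report this queue as timed out.
 * Nothing is reported while the device is detached or the carrier is off.
 */
static bool model_txq_timed_out(const struct model_txq *q,
				bool device_present, bool carrier_ok,
				unsigned long now, unsigned long timeo)
{
	if (!device_present || !carrier_ok)
		return false;

	return q->stopped && model_time_after(now, q->trans_start + timeo);
}
```

With trans_start = 100 and timeo = 5, a stopped queue is reported at tick
106 but not at tick 105, and never while device_present is false.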
Stefan
On 11/8/2022 2:53 AM, Stefan Assmann wrote:
> With this patch applied iavf doesn't crash anymore but after a few
> cycles with the reproducer tx timeouts are observed.
>
> [...]
>
> This only occurs when the device is detached and reattached during reset.
Hi Ivan,
Was there going to be an update to the patch to resolve this? If not,
I'll take what there is now.
Thanks,
Tony
> -----Original Message-----
> From: Intel-wired-lan <[email protected]> On Behalf Of Ivan
> Vecera
> Sent: Tuesday, November 8, 2022 10:36 AM
> To: [email protected]
> Cc: SlawomirX Laba <[email protected]>; Eric Dumazet
> <[email protected]>; moderated list:INTEL ETHERNET DRIVERS <intel-
> [email protected]>; open list <[email protected]>;
> Piotrowski, Patryk <[email protected]>; Jakub Kicinski
> <[email protected]>; Paolo Abeni <[email protected]>; David S. Miller
> <[email protected]>; [email protected]
> Subject: [Intel-wired-lan] [PATCH net] iavf: Fix a crash during reset task
>
> Recent commit aa626da947e9 ("iavf: Detach device during reset task") removed
> netif_tx_stop_all_queues() with an assumption that Tx queues are already
> stopped by netif_device_detach() in the beginning of reset task. This assumption
> is incorrect because during reset task a potential link event can start Tx queues
> again.
> Revert this change to fix this issue.
>
> Reproducer:
> 1. Run some Tx traffic (e.g. iperf3) over iavf interface
> 2. Switch MTU of this interface in a loop
>
> [root@host ~]# cat repro.sh
> #!/bin/sh
>
> IF=enp2s0f0v0
>
> iperf3 -c 192.168.0.1 -t 600 --logfile /dev/null &
> sleep 2
>
> while :; do
> for i in 1280 1500 2000 900 ; do
> ip link set $IF mtu $i
> sleep 2
> done
> done
> [root@host ~]# ./repro.sh
>
> Result:
> [ 306.199917] iavf 0000:02:02.0 enp2s0f0v0: NIC Link is Up Speed is 40 Gbps Full Duplex
> [ 308.205944] iavf 0000:02:02.0 enp2s0f0v0: NIC Link is Up Speed is 40 Gbps Full Duplex
> [ 310.103223] BUG: kernel NULL pointer dereference, address: 0000000000000008
> [ 310.110179] #PF: supervisor write access in kernel mode
> [ 310.115396] #PF: error_code(0x0002) - not-present page
> [ 310.120526] PGD 0 P4D 0
> [ 310.123057] Oops: 0002 [#1] PREEMPT SMP NOPTI
> [ 310.127408] CPU: 24 PID: 183 Comm: kworker/u64:9 Kdump: loaded Not tainted 6.1.0-rc3+ #2
> [ 310.135485] Hardware name: Abacus electric, s.r.o. - [email protected] Super Server/H12SSW-iN, BIOS 2.4 04/13/2022
> [ 310.145728] Workqueue: iavf iavf_reset_task [iavf]
> [ 310.150520] RIP: 0010:iavf_xmit_frame_ring+0xd1/0xf70 [iavf]
> [ 310.156180] Code: d0 0f 86 da 00 00 00 83 e8 01 0f b7 fa 29 f8 01 c8 39 c6 0f 8f a0 08 00 00 48 8b 45 20 48 8d 14 92 bf 01 00 00 00 4c 8d 3c d0 <49> 89 5f 08 8b 43 70 66 41 89 7f 14 41 89 47 10 f6 83 82 00 00 00
> [ 310.174918] RSP: 0018:ffffbb5f0082caa0 EFLAGS: 00010293
> [ 310.180137] RAX: 0000000000000000 RBX: ffff92345471a6e8 RCX: 0000000000000200
> [ 310.187259] RDX: 0000000000000000 RSI: 000000000000000d RDI: 0000000000000001
> [ 310.194385] RBP: ffff92341d249000 R08: ffff92434987fcac R09: 0000000000000001
> [ 310.201509] R10: 0000000011f683b9 R11: 0000000011f50641 R12: 0000000000000008
> [ 310.208631] R13: ffff923447500000 R14: 0000000000000000 R15: 0000000000000000
> [ 310.215756] FS: 0000000000000000(0000) GS:ffff92434ee00000(0000) knlGS:0000000000000000
> [ 310.223835] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 310.229572] CR2: 0000000000000008 CR3: 0000000fbc210004 CR4: 0000000000770ee0
> [ 310.236696] PKRU: 55555554
> [ 310.239399] Call Trace:
> [ 310.241844] <IRQ>
> [ 310.243855] ? dst_alloc+0x5b/0xb0
> [ 310.247260] dev_hard_start_xmit+0x9e/0x1f0
> [ 310.251439] sch_direct_xmit+0xa0/0x370
> [ 310.255276] __qdisc_run+0x13e/0x580
> [ 310.258848] __dev_queue_xmit+0x431/0xd00
> [ 310.262851] ? selinux_ip_postroute+0x147/0x3f0
> [ 310.267377] ip_finish_output2+0x26c/0x540
>
> Fixes: aa626da947e9 ("iavf: Detach device during reset task")
> Cc: Jacob Keller <[email protected]>
> Cc: Patryk Piotrowski <[email protected]>
> Cc: SlawomirX Laba <[email protected]>
> Signed-off-by: Ivan Vecera <[email protected]>
> ---
> drivers/net/ethernet/intel/iavf/iavf_main.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c
> index 3fc572341781..5abcd66e7c7a 100644
> --- a/drivers/net/ethernet/intel/iavf/iavf_main.c
> +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c
> @@ -3033,6 +3033,7 @@ static void iavf_reset_task(struct work_struct *work)
Tested-by: Konrad Jankowski <[email protected]>