2021-08-06 08:30:17

by Greg Kroah-Hartman

Subject: [PATCH 5.13 06/35] bpf, sockmap: On cleanup we additionally need to remove cached skb

From: John Fastabend <[email protected]>

[ Upstream commit 476d98018f32e68e7c5d4e8456940cf2b6d66f10 ]

It's possible that, if a socket is closed while the receive thread is under
memory pressure, the thread has cached an skb. We need to ensure these skbs
are freed along with the normal ingress_skb queue.

Before 799aa7f98d53 ("skmsg: Avoid lock_sock() in sk_psock_backlog()"), tear
down and backlog processing both held sock_lock for the common case of
socket close or unhash. The two could therefore not run in parallel, so
the kfree is all that is needed in those kernels.

But the latest kernels include commit 799aa7f98d53, and this requires a
bit more work. Without the ingress_lock guarding reads and writes of
state->skb, the tear down could run before the state update, leaking
memory. Worse, the backlog could read the state while interleaved with
the tear down, so that the tear down side frees state->skb while the
backlog side already holds a reference to it. To resolve such races we
wrap the accesses in ingress_lock on both sides, serializing the tear down
and backlog cases. In both cases this only happens after an EAGAIN error,
so having an extra lock in place is likely fine. The normal path skips
the locks.
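
As a rough userspace illustration of that serialization (not the kernel
code), the sketch below uses a pthread mutex in place of ingress_lock and a
plain bool in place of SK_PSOCK_TX_ENABLED. All names (psock_model,
buf_model, cache_on_eagain, teardown) are hypothetical, and folding the
flag clear into teardown() is a simplifying assumption of the model.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Userspace stand-ins; none of these names exist in the kernel. */
struct buf_model { int len; };

struct psock_model {
	pthread_mutex_t ingress_lock;	/* stands in for psock->ingress_lock */
	bool tx_enabled;		/* stands in for SK_PSOCK_TX_ENABLED */
	struct buf_model *cached;	/* stands in for work_state.skb */
	int freed;			/* free counter, lets a test spot leaks */
};

static void psock_model_init(struct psock_model *p)
{
	assert(p != NULL);
	pthread_mutex_init(&p->ingress_lock, NULL);
	p->tx_enabled = true;
	p->cached = NULL;
	p->freed = 0;
}

/* Free a buffer and count it; a NULL buffer is ignored. */
static void buf_drop(struct psock_model *p, struct buf_model *b)
{
	if (b) {
		free(b);
		p->freed++;
	}
}

/* Backlog side after an EAGAIN-like failure: park the buffer for a later
 * retry, or drop it if teardown has already started.
 */
static void cache_on_eagain(struct psock_model *p, struct buf_model *b)
{
	pthread_mutex_lock(&p->ingress_lock);
	if (p->tx_enabled)
		p->cached = b;
	else
		buf_drop(p, b);
	pthread_mutex_unlock(&p->ingress_lock);
}

/* Teardown side: stop further caching, then free and clear what is parked. */
static void teardown(struct psock_model *p)
{
	pthread_mutex_lock(&p->ingress_lock);
	p->tx_enabled = false;
	buf_drop(p, p->cached);
	p->cached = NULL;
	pthread_mutex_unlock(&p->ingress_lock);
}
```

Because both functions take the same lock, a buffer handed to
cache_on_eagain() is either parked before teardown() runs (and freed by
it) or dropped on the spot afterwards; in the model it is never leaked
and never freed twice.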

Note, we check state->skb before grabbing the lock. This works because
we can only enqueue while holding the mutex we already hold, which avoids
a race on adding state->skb after the check. If the tear down path is
running, that is also fine: if the tear down path then removes state->skb,
we will simply set skb=NULL and the subsequent goto is skipped. This
slight complication avoids locking in the normal case.
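
The check-then-lock step can be modelled in userspace roughly as follows.
This is only a sketch under assumptions: work_state_model, buf_model and
claim_cached are hypothetical names, a pthread mutex stands in for
ingress_lock, the caller is assumed to hold an unmodelled equivalent of
work_mutex, and a C11 atomic pointer is used so the unlocked peek is well
defined outside the kernel.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace stand-ins; none of these names exist in the kernel. */
struct buf_model { int len; };

struct work_state_model {
	pthread_mutex_t ingress_lock;		/* stands in for ingress_lock */
	struct buf_model *_Atomic cached;	/* stands in for state->skb */
};

static void work_state_init(struct work_state_model *s)
{
	assert(s != NULL);
	pthread_mutex_init(&s->ingress_lock, NULL);
	atomic_init(&s->cached, (struct buf_model *)NULL);
}

/* Take ownership of a parked buffer, if there is one.
 * Assumption: the caller holds the work mutex equivalent, so nobody can
 * park a new buffer between the unlocked peek and the locked read.
 */
static struct buf_model *claim_cached(struct work_state_model *s)
{
	struct buf_model *b = NULL;

	/* Unlocked peek: the common case sees NULL and takes no lock. */
	if (atomic_load_explicit(&s->cached, memory_order_relaxed)) {
		pthread_mutex_lock(&s->ingress_lock);
		/* Re-read under the lock; teardown may have cleared it. */
		b = atomic_load_explicit(&s->cached, memory_order_relaxed);
		atomic_store_explicit(&s->cached, (struct buf_model *)NULL,
				      memory_order_relaxed);
		pthread_mutex_unlock(&s->ingress_lock);
	}
	/* NULL means nothing to resume; fall through to the normal queue. */
	return b;
}
```

The value returned is always the one read under the lock, so a stale
non-NULL peek can at worst cost one unnecessary lock round trip.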

With this fix we no longer see this warning splat from tcp side on
socket close when we hit the above case with redirect to ingress self.

[224913.935822] WARNING: CPU: 3 PID: 32100 at net/core/stream.c:208 sk_stream_kill_queues+0x212/0x220
[224913.935841] Modules linked in: fuse overlay bpf_preload x86_pkg_temp_thermal intel_uncore wmi_bmof squashfs sch_fq_codel efivarfs ip_tables x_tables uas xhci_pci ixgbe mdio xfrm_algo xhci_hcd wmi
[224913.935897] CPU: 3 PID: 32100 Comm: fgs-bench Tainted: G I 5.14.0-rc1alu+ #181
[224913.935908] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.9.2 01/24/2019
[224913.935914] RIP: 0010:sk_stream_kill_queues+0x212/0x220
[224913.935923] Code: 8b 83 20 02 00 00 85 c0 75 20 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 89 df e8 2b 11 fe ff eb c3 0f 0b e9 7c ff ff ff 0f 0b eb ce <0f> 0b 5b 5d 41 5c 41 5d 41 5e 41 5f c3 90 0f 1f 44 00 00 41 57 41
[224913.935932] RSP: 0018:ffff88816271fd38 EFLAGS: 00010206
[224913.935941] RAX: 0000000000000ae8 RBX: ffff88815acd5240 RCX: dffffc0000000000
[224913.935948] RDX: 0000000000000003 RSI: 0000000000000ae8 RDI: ffff88815acd5460
[224913.935954] RBP: ffff88815acd5460 R08: ffffffff955c0ae8 R09: fffffbfff2e6f543
[224913.935961] R10: ffffffff9737aa17 R11: fffffbfff2e6f542 R12: ffff88815acd5390
[224913.935967] R13: ffff88815acd5480 R14: ffffffff98d0c080 R15: ffffffff96267500
[224913.935974] FS: 00007f86e6bd1700(0000) GS:ffff888451cc0000(0000) knlGS:0000000000000000
[224913.935981] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[224913.935988] CR2: 000000c0008eb000 CR3: 00000001020e0005 CR4: 00000000003706e0
[224913.935994] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[224913.936000] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[224913.936007] Call Trace:
[224913.936016] inet_csk_destroy_sock+0xba/0x1f0
[224913.936033] __tcp_close+0x620/0x790
[224913.936047] tcp_close+0x20/0x80
[224913.936056] inet_release+0x8f/0xf0
[224913.936070] __sock_release+0x72/0x120
[224913.936083] sock_close+0x14/0x20

Fixes: a136678c0bdbb ("bpf: sk_msg, zap ingress queue on psock down")
Signed-off-by: John Fastabend <[email protected]>
Signed-off-by: Andrii Nakryiko <[email protected]>
Acked-by: Jakub Sitnicki <[email protected]>
Acked-by: Martin KaFai Lau <[email protected]>
Link: https://lore.kernel.org/bpf/[email protected]
Signed-off-by: Sasha Levin <[email protected]>
---
net/core/skmsg.c | 35 +++++++++++++++++++++++++++++------
1 file changed, 29 insertions(+), 6 deletions(-)

diff --git a/net/core/skmsg.c b/net/core/skmsg.c
index b088fe07fc00..7e7205e93258 100644
--- a/net/core/skmsg.c
+++ b/net/core/skmsg.c
@@ -613,23 +613,42 @@ static void sock_drop(struct sock *sk, struct sk_buff *skb)
 	kfree_skb(skb);
 }

+static void sk_psock_skb_state(struct sk_psock *psock,
+			       struct sk_psock_work_state *state,
+			       struct sk_buff *skb,
+			       int len, int off)
+{
+	spin_lock_bh(&psock->ingress_lock);
+	if (sk_psock_test_state(psock, SK_PSOCK_TX_ENABLED)) {
+		state->skb = skb;
+		state->len = len;
+		state->off = off;
+	} else {
+		sock_drop(psock->sk, skb);
+	}
+	spin_unlock_bh(&psock->ingress_lock);
+}
+
 static void sk_psock_backlog(struct work_struct *work)
 {
 	struct sk_psock *psock = container_of(work, struct sk_psock, work);
 	struct sk_psock_work_state *state = &psock->work_state;
-	struct sk_buff *skb;
+	struct sk_buff *skb = NULL;
 	bool ingress;
 	u32 len, off;
 	int ret;

 	mutex_lock(&psock->work_mutex);
-	if (state->skb) {
+	if (unlikely(state->skb)) {
+		spin_lock_bh(&psock->ingress_lock);
 		skb = state->skb;
 		len = state->len;
 		off = state->off;
 		state->skb = NULL;
-		goto start;
+		spin_unlock_bh(&psock->ingress_lock);
 	}
+	if (skb)
+		goto start;

 	while ((skb = skb_dequeue(&psock->ingress_skb))) {
 		len = skb->len;
@@ -644,9 +663,8 @@ static void sk_psock_backlog(struct work_struct *work)
 						  len, ingress);
 			if (ret <= 0) {
 				if (ret == -EAGAIN) {
-					state->skb = skb;
-					state->len = len;
-					state->off = off;
+					sk_psock_skb_state(psock, state, skb,
+							   len, off);
 					goto end;
 				}
 				/* Hard errors break pipe and stop xmit. */
@@ -745,6 +763,11 @@ static void __sk_psock_zap_ingress(struct sk_psock *psock)
 		skb_bpf_redirect_clear(skb);
 		sock_drop(psock->sk, skb);
 	}
+	kfree_skb(psock->work_state.skb);
+	/* We null the skb here to ensure that calls to sk_psock_backlog
+	 * do not pick up the free'd skb.
+	 */
+	psock->work_state.skb = NULL;
 	__sk_psock_purge_ingress_msg(psock);
 }

--
2.30.2




2021-08-07 17:48:14

by Naresh Kamboju

Subject: Re: [PATCH 5.13 06/35] bpf, sockmap: On cleanup we additionally need to remove cached skb

While running the packetdrill test suite on qemu aarch64, the following
warnings were noticed intermittently with the stable-rc 5.13.9-rc1 kernel.

On Fri, 6 Aug 2021 at 13:52, Greg Kroah-Hartman
<[email protected]> wrote:
>
> From: John Fastabend <[email protected]>
>
> [ Upstream commit 476d98018f32e68e7c5d4e8456940cf2b6d66f10 ]

<trim>

> With this fix we no longer see this warning splat from tcp side on
> socket close when we hit the above case with redirect to ingress self.
>
> [224913.935822] WARNING: CPU: 3 PID: 32100 at net/core/stream.c:208 sk_stream_kill_queues+0x212/0x220
> [224913.935841] Modules linked in: fuse overlay bpf_preload x86_pkg_temp_thermal intel_uncore wmi_bmof squashfs sch_fq_codel efivarfs ip_tables x_tables uas xhci_pci ixgbe mdio xfrm_algo xhci_hcd wmi
> [224913.935897] CPU: 3 PID: 32100 Comm: fgs-bench Tainted: G I 5.14.0-rc1alu+ #181
> [224913.935908] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.9.2 01/24/2019
> [224913.935914] RIP: 0010:sk_stream_kill_queues+0x212/0x220


steps to reproduce:
-------------------
# boot qemu aarch64 with stable-rc 5.13.9-rc1 kernel
# /usr/bin/qemu-system-aarch64 -cpu max -machine virt,accel=kvm
-nographic -net nic,model=virtio,macaddr=BC:DD:AD:CC:09:01 -net tap -m
4096 -monitor none -kernel kernel/Image.gz --append "console=ttyAMA0
root=/dev/vda rw" -hda
rpb-console-image-lkft-juno-20210525221209.rootfs.ext4 -m 4096 -smp 4
-nographic

# Run test
# cd ./automated/linux/packetdrill/
# ./configure
# make all
# python3 ./packetdrill/run_all.py -v -l -L

## Build
* kernel: 5.13.9-rc1
* git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
* git branch: linux-5.13.y
* git commit: 1eb1590ab470d5f73dd2d20a7196bca35fa3d3e7
* git describe: v5.13.8-36-g1eb1590ab470
* test details:
https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.13.y/build/v5.13.8-36-g1eb1590ab470


Crash log:
-----------
INFO: Skip installing package dependency for packetdrill
/opt/packetdrill /lava-3242370/0/tests/1_packetdrill/automated/linux/packetdrill
[ 11.329564] tun: Universal TUN/TAP device driver, 1.6
[ 14.801347] TCP: tun0: Driver has suspect GRO implementation, TCP performance may be compromised.
[ 15.113626] ------------[ cut here ]------------
[ 15.115380] WARNING: CPU: 3 PID: 671 at net/core/stream.c:207 sk_stream_kill_queues+0x104/0x130
[ 15.118527] Modules linked in: tun crct10dif_ce rfkill fuse
[ 15.120361] CPU: 3 PID: 671 Comm: packetdrill Not tainted 5.13.9-rc1 #1
[ 15.122587] Hardware name: linux,dummy-virt (DT)
[ 15.124123] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[ 15.126117] pc : sk_stream_kill_queues+0x104/0x130
[ 15.127764] lr : inet_csk_destroy_sock+0x68/0x130
[ 15.129326] sp : ffff8000109f36d0
[ 15.130484] x29: ffff8000109f36d0 x28: 0000000000000005 x27: fffffffffffffff2
[ 15.132807] x26: 0000000000000001 x25: ffffa05141000900 x24: ffff6f1e0b51dc40
[ 15.136643] x23: 0000000000000000 x22: 0000000000000000 x21: ffff6f1e11e7e054
[ 15.139117] x20: ffff6f1e08a0dd30 x19: ffff6f1e08a0dc80 x18: 0000000000000000
[ 15.141494] x17: 0000000000000000 x16: 0000000000000000 x15: 00000000238ecea0
[ 15.143903] x14: 0000000000000000 x13: 000000000000dd86 x12: 000000007ffff000
[ 15.146292] x11: 0000000000000004 x10: 0000000000000000 x9 : ffffa051410d5148
[ 15.148660] x8 : 0000000000000000 x7 : ffffffffd3039400 x6 : 0000000000000202
[ 15.151071] x5 : ffff6f1e08a0dd00 x4 : 0000000000000004 x3 : 0000000000000007
[ 15.153435] x2 : ffff6f1e08a0e560 x1 : 0000000000000180 x0 : 00000000fffffe80
[ 15.155835] Call trace:
[ 15.156675] sk_stream_kill_queues+0x104/0x130
[ 15.163667] inet_csk_destroy_sock+0x68/0x130
[ 15.165139] tcp_done+0x120/0x1b0
[ 15.166274] tcp_reset+0x74/0x130
[ 15.167445] tcp_validate_incoming+0x394/0x510
[ 15.168953] tcp_rcv_state_process+0x2d8/0x15c0
[ 15.170512] tcp_v4_do_rcv+0x15c/0x2d4
[ 15.171798] tcp_v4_rcv+0x9c0/0xaa4
[ 15.173009] ip_protocol_deliver_rcu+0x4c/0x184
[ 15.174597] ip_local_deliver_finish+0x74/0x90
[ 15.176103] ip_local_deliver+0x88/0x130
[ 15.177429] ip_rcv+0x7c/0x130
[ 15.178512] __netif_receive_skb_one_core+0x60/0x8c
[ 15.180143] __netif_receive_skb+0x20/0x70
[ 15.181527] netif_receive_skb+0x48/0x1e0
[ 15.182930] tun_get_user+0xbe4/0xd70 [tun]
[ 15.184368] tun_chr_write_iter+0x68/0xf0 [tun]
[ 15.185936] do_iter_readv_writev+0x100/0x1a4
[ 15.187448] do_iter_write+0x98/0x1fc
[ 15.188735] vfs_writev+0xb4/0x170
[ 15.189956] do_writev+0x7c/0x140
[ 15.191115] __arm64_sys_writev+0x2c/0x40
[ 15.192481] invoke_syscall+0x50/0x120
[ 15.193754] el0_svc_common.constprop.0+0xf4/0x104
[ 15.195391] do_el0_svc+0x34/0x9c
[ 15.196558] el0_svc+0x2c/0x54
[ 15.197653] el0_sync_handler+0xa4/0x130
[ 15.199054] el0_sync+0x198/0x1c0
[ 15.200182] ---[ end trace c9faa1be6c93e4fb ]---
[ 15.201864] ------------[ cut here ]------------
[ 15.203404] WARNING: CPU: 3 PID: 671 at net/core/stream.c:208 sk_stream_kill_queues+0x110/0x130
[ 15.206141] Modules linked in: tun crct10dif_ce rfkill fuse
[ 15.207976] CPU: 3 PID: 671 Comm: packetdrill Tainted: G W 5.13.9-rc1 #1
[ 15.210542] Hardware name: linux,dummy-virt (DT)
[ 15.212029] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
[ 15.213947] pc : sk_stream_kill_queues+0x110/0x130
[ 15.215546] lr : inet_csk_destroy_sock+0x68/0x130
[ 15.217097] sp : ffff8000109f36d0
[ 15.218204] x29: ffff8000109f36d0 x28: 0000000000000005 x27: fffffffffffffff2
[ 15.220548] x26: 0000000000000001 x25: ffffa05141000900 x24: ffff6f1e0b51dc40
[ 15.223012] x23: 0000000000000000 x22: 0000000000000000 x21: ffff6f1e11e7e054
[ 15.225350] x20: ffff6f1e08a0dd30 x19: ffff6f1e08a0dc80 x18: 0000000000000000
[ 15.227705] x17: 0000000000000000 x16: 0000000000000000 x15: 00000000238ecea0
[ 15.230042] x14: 0000000000000000 x13: 000000000000dd86 x12: 000000007ffff000
[ 15.232454] x11: 0000000000000004 x10: 0000000000000000 x9 : ffffa051410d5148
[ 15.234896] x8 : 0000000000000000 x7 : ffffffffd3039400 x6 : 0000000000000202
[ 15.237544] x5 : ffff6f1e08a0dd00 x4 : 0000000000000004 x3 : 0000000000000007
[ 15.239933] x2 : ffff6f1e08a0e560 x1 : 0000000000000180 x0 : 0000000000000180
[ 15.242315] Call trace:
[ 15.243203] sk_stream_kill_queues+0x110/0x130
[ 15.244721] inet_csk_destroy_sock+0x68/0x130
[ 15.246191] tcp_done+0x120/0x1b0
[ 15.247388] tcp_reset+0x74/0x130
[ 15.248540] tcp_validate_incoming+0x394/0x510
[ 15.250057] tcp_rcv_state_process+0x2d8/0x15c0
[ 15.251636] tcp_v4_do_rcv+0x15c/0x2d4
[ 15.252908] tcp_v4_rcv+0x9c0/0xaa4
[ 15.254095] ip_protocol_deliver_rcu+0x4c/0x184
[ 15.255643] ip_local_deliver_finish+0x74/0x90
[ 15.257138] ip_local_deliver+0x88/0x130
[ 15.258516] ip_rcv+0x7c/0x130
[ 15.259573] __netif_receive_skb_one_core+0x60/0x8c
[ 15.261561] __netif_receive_skb+0x20/0x70
[ 15.263355] netif_receive_skb+0x48/0x1e0
[ 15.264721] tun_get_user+0xbe4/0xd70 [tun]
[ 15.266110] tun_chr_write_iter+0x68/0xf0 [tun]
[ 15.267704] do_iter_readv_writev+0x100/0x1a4
[ 15.269162] do_iter_write+0x98/0x1fc
[ 15.270721] vfs_writev+0xb4/0x170
[ 15.271948] do_writev+0x7c/0x140
[ 15.273120] __arm64_sys_writev+0x2c/0x40
[ 15.274527] invoke_syscall+0x50/0x120
[ 15.275805] el0_svc_common.constprop.0+0xf4/0x104
[ 15.277278] do_el0_svc+0x34/0x9c
[ 15.278832] el0_svc+0x2c/0x54
[ 15.280081] el0_sync_handler+0xa4/0x130
[ 15.281614] el0_sync+0x198/0x1c0
[ 15.282899] ---[ end trace c9faa1be6c93e4fc ]---
[ 15.284650] ------------[ cut here ]------------
[ 15.286116] WARNING: CPU: 3 PID: 671 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x190/0x1b0
[ 15.288999] Modules linked in: tun crct10dif_ce rfkill fuse
[ 15.290884] CPU: 3 PID: 671 Comm: packetdrill Tainted: G W 5.13.9-rc1 #1
[ 15.294511] Hardware name: linux,dummy-virt (DT)
[ 15.296041] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[ 15.298073] pc : inet_sock_destruct+0x190/0x1b0
[ 15.299743] lr : __sk_destruct+0x38/0x23c
[ 15.301154] sp : ffff8000109f37c0
[ 15.302291] x29: ffff8000109f37c0 x28: 0000000000000005 x27: fffffffffffffff2
[ 15.304703] x26: 0000000000000001 x25: ffffa05141000900 x24: ffffa05142a8be80
[ 15.307100] x23: 0000000000000000 x22: ffff6f1e08a0dd08 x21: ffff6f1e08a0dc80
[ 15.309419] x20: ffff6f1e08a0dd30 x19: ffff6f1e08a0dc80 x18: 0000000000000000
[ 15.311764] x17: 0000000000000000 x16: 0000000000000000 x15: 00000000238ecea0
[ 15.314056] x14: 0000000000000000 x13: 000000000000dd86 x12: 000000007ffff000
[ 15.316392] x11: 0000000000000004 x10: 0000000000000000 x9 : ffffa05141003acc
[ 15.318680] x8 : 0000000000000000 x7 : ffffffffd3039400 x6 : ffff6f1e08a0ddbc
[ 15.320920] x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffffceccfd9db000
[ 15.323197] x2 : ffff6f1e02c00810 x1 : 0000000000000180 x0 : 00000000fffffe80
[ 15.325465] Call trace:
[ 15.326270] inet_sock_destruct+0x190/0x1b0
[ 15.327643] __sk_destruct+0x38/0x23c
[ 15.328842] __sk_free+0x80/0x120
[ 15.329923] sk_free+0x68/0x90
[ 15.330941] sock_put+0x5c/0x80
[ 15.331963] tcp_v4_rcv+0xa40/0xaa4
[ 15.333091] ip_protocol_deliver_rcu+0x4c/0x184
[ 15.334585] ip_local_deliver_finish+0x74/0x90
[ 15.336000] ip_local_deliver+0x88/0x130
[ 15.337273] ip_rcv+0x7c/0x130
[ 15.338277] __netif_receive_skb_one_core+0x60/0x8c
[ 15.339869] __netif_receive_skb+0x20/0x70
[ 15.341221] netif_receive_skb+0x48/0x1e0
[ 15.342558] tun_get_user+0xbe4/0xd70 [tun]
[ 15.343929] tun_chr_write_iter+0x68/0xf0 [tun]
[ 15.345395] do_iter_readv_writev+0x100/0x1a4
[ 15.346831] do_iter_write+0x98/0x1fc
[ 15.348025] vfs_writev+0xb4/0x170
[ 15.349136] do_writev+0x7c/0x140
[ 15.350224] __arm64_sys_writev+0x2c/0x40
[ 15.351556] invoke_syscall+0x50/0x120
[ 15.352789] el0_svc_common.constprop.0+0xf4/0x104
[ 15.354335] do_el0_svc+0x34/0x9c
[ 15.355451] el0_svc+0x2c/0x54
[ 15.356460] el0_sync_handler+0xa4/0x130
[ 15.357740] el0_sync+0x198/0x1c0
[ 15.358861] ---[ end trace c9faa1be6c93e4fd ]---
[ 15.360555] ------------[ cut here ]------------
[ 15.361959] WARNING: CPU: 3 PID: 671 at net/ipv4/af_inet.c:157 inet_sock_destruct+0x16c/0x1b0
[ 15.364603] Modules linked in: tun crct10dif_ce rfkill fuse
[ 15.366276] CPU: 3 PID: 671 Comm: packetdrill Tainted: G W 5.13.9-rc1 #1
[ 15.373031] Hardware name: linux,dummy-virt (DT)
[ 15.377359] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
[ 15.381502] pc : inet_sock_destruct+0x16c/0x1b0
[ 15.384348] lr : __sk_destruct+0x38/0x23c
[ 15.387724] sp : ffff8000109f37c0
[ 15.388842] x29: ffff8000109f37c0 x28: 0000000000000005 x27: fffffffffffffff2
[ 15.391840] x26: 0000000000000001 x25: ffffa05141000900 x24: ffffa05142a8be80
[ 15.394837] x23: 0000000000000000 x22: ffff6f1e08a0dd08 x21: ffff6f1e08a0dc80
[ 15.398292] x20: ffff6f1e08a0dd30 x19: ffff6f1e08a0dc80 x18: 0000000000000000
[ 15.401315] x17: 0000000000000000 x16: 0000000000000000 x15: 00000000238ecea0
[ 15.403655] x14: 0000000000000000 x13: 000000000000dd86 x12: 000000007ffff000
[ 15.405967] x11: 0000000000000004 x10: 0000000000000000 x9 : ffffa05141003acc
[ 15.408287] x8 : 0000000000000000 x7 : ffffffffd3039400 x6 : ffff6f1e08a0ddbc
[ 15.410616] x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffffceccfd9db000
[ 15.412969] x2 : ffff6f1e02c00810 x1 : 0000000000000180 x0 : 0000000000000180
[ 15.415332] Call trace:
[ 15.416187] inet_sock_destruct+0x16c/0x1b0
[ 15.417551] __sk_destruct+0x38/0x23c
[ 15.418793] __sk_free+0x80/0x120
[ 15.419886] sk_free+0x68/0x90
[ 15.420916] sock_put+0x5c/0x80
[ 15.421974] tcp_v4_rcv+0xa40/0xaa4
[ 15.423182] ip_protocol_deliver_rcu+0x4c/0x184
[ 15.424649] ip_local_deliver_finish+0x74/0x90
[ 15.426214] ip_local_deliver+0x88/0x130
[ 15.427869] ip_rcv+0x7c/0x130
[ 15.428884] __netif_receive_skb_one_core+0x60/0x8c
[ 15.430545] __netif_receive_skb+0x20/0x70
[ 15.432679] netif_receive_skb+0x48/0x1e0
[ 15.435651] tun_get_user+0xbe4/0xd70 [tun]
[ 15.437662] tun_chr_write_iter+0x68/0xf0 [tun]
[ 15.439835] do_iter_readv_writev+0x100/0x1a4
[ 15.441836] do_iter_write+0x98/0x1fc
[ 15.443588] vfs_writev+0xb4/0x170
[ 15.445150] do_writev+0x7c/0x140
[ 15.446713] __arm64_sys_writev+0x2c/0x40
[ 15.448639] invoke_syscall+0x50/0x120
[ 15.450373] el0_svc_common.constprop.0+0xf4/0x104
[ 15.452642] do_el0_svc+0x34/0x9c
[ 15.454230] el0_svc+0x2c/0x54
[ 15.455743] el0_sync_handler+0xa4/0x130
[ 15.457486] el0_sync+0x198/0x1c0
[ 15.458659] ---[ end trace c9faa1be6c93e4fe ]---

Reported-by: Linux Kernel Functional Testing <[email protected]>

--
Linaro LKFT
https://lkft.linaro.org

2021-08-09 20:26:21

by John Fastabend

Subject: Re: [PATCH 5.13 06/35] bpf, sockmap: On cleanup we additionally need to remove cached skb

Naresh Kamboju wrote:
> While running the packetdrill test suite on qemu aarch64, the following
> warnings were noticed intermittently with the stable-rc 5.13.9-rc1 kernel.
>
> On Fri, 6 Aug 2021 at 13:52, Greg Kroah-Hartman
> <[email protected]> wrote:
> >
> > From: John Fastabend <[email protected]>
> >
> > [ Upstream commit 476d98018f32e68e7c5d4e8456940cf2b6d66f10 ]
>
> <trim>
>
> > With this fix we no longer see this warning splat from tcp side on
> > socket close when we hit the above case with redirect to ingress self.
> >
> > [224913.935822] WARNING: CPU: 3 PID: 32100 at net/core/stream.c:208 sk_stream_kill_queues+0x212/0x220
> > [224913.935841] Modules linked in: fuse overlay bpf_preload x86_pkg_temp_thermal intel_uncore wmi_bmof squashfs sch_fq_codel efivarfs ip_tables x_tables uas xhci_pci ixgbe mdio xfrm_algo xhci_hcd wmi
> > [224913.935897] CPU: 3 PID: 32100 Comm: fgs-bench Tainted: G I 5.14.0-rc1alu+ #181
> > [224913.935908] Hardware name: Dell Inc. Precision 5820 Tower/002KVM, BIOS 1.9.2 01/24/2019
> > [224913.935914] RIP: 0010:sk_stream_kill_queues+0x212/0x220
>
>
> steps to reproduce:
> -------------------
> # boot qemu aarch64 with stable-rc 5.13.9-rc1 kernel
> # /usr/bin/qemu-system-aarch64 -cpu max -machine virt,accel=kvm
> -nographic -net nic,model=virtio,macaddr=BC:DD:AD:CC:09:01 -net tap -m
> 4096 -monitor none -kernel kernel/Image.gz --append "console=ttyAMA0
> root=/dev/vda rw" -hda
> rpb-console-image-lkft-juno-20210525221209.rootfs.ext4 -m 4096 -smp 4
> -nographic
>
> # Run test
> # cd ./automated/linux/packetdrill/
> # ./configure
> # make all
> # python3 ./packetdrill/run_all.py -v -l -L
>
> ## Build
> * kernel: 5.13.9-rc1
> * git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> * git branch: linux-5.13.y
> * git commit: 1eb1590ab470d5f73dd2d20a7196bca35fa3d3e7
> * git describe: v5.13.8-36-g1eb1590ab470
> * test details:
> https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.13.y/build/v5.13.8-36-g1eb1590ab470

Hi Naresh,

The fix here should only be visible with sockmap BPF programs running. The
trace below doesn't seem to have any of the BPF calls. I tried to parse
the test details, but I didn't see how packetdrill and the BPF tests
are related. The relevant test, linked here, seems to be passing in
your case.

https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.13.y/build/v5.13.8-36-g1eb1590ab470/testrun/5382761/suite/kselftest-bpf/test/bpf.test_sockmap/details/

Am I correct in assuming that you bisected to this patch somehow, but
did not have any BPF programs running?

Thanks,
John

>
>
> Crash log:
> -----------
> INFO: Skip installing package dependency for packetdrill
> /opt/packetdrill /lava-3242370/0/tests/1_packetdrill/automated/linux/packetdrill
> [ 11.329564] tun: Universal TUN/TAP device driver, 1.6
> [ 14.801347] TCP: tun0: Driver has suspect GRO implementation, TCP performance may be compromised.
> [ 15.113626] ------------[ cut here ]------------
> [ 15.115380] WARNING: CPU: 3 PID: 671 at net/core/stream.c:207 sk_stream_kill_queues+0x104/0x130
> [ 15.118527] Modules linked in: tun crct10dif_ce rfkill fuse
> [ 15.120361] CPU: 3 PID: 671 Comm: packetdrill Not tainted 5.13.9-rc1 #1
> [ 15.122587] Hardware name: linux,dummy-virt (DT)
> [ 15.124123] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> [ 15.126117] pc : sk_stream_kill_queues+0x104/0x130
> [ 15.127764] lr : inet_csk_destroy_sock+0x68/0x130
> [ 15.129326] sp : ffff8000109f36d0
> [ 15.130484] x29: ffff8000109f36d0 x28: 0000000000000005 x27: fffffffffffffff2
> [ 15.132807] x26: 0000000000000001 x25: ffffa05141000900 x24: ffff6f1e0b51dc40
> [ 15.136643] x23: 0000000000000000 x22: 0000000000000000 x21: ffff6f1e11e7e054
> [ 15.139117] x20: ffff6f1e08a0dd30 x19: ffff6f1e08a0dc80 x18: 0000000000000000
> [ 15.141494] x17: 0000000000000000 x16: 0000000000000000 x15: 00000000238ecea0
> [ 15.143903] x14: 0000000000000000 x13: 000000000000dd86 x12: 000000007ffff000
> [ 15.146292] x11: 0000000000000004 x10: 0000000000000000 x9 : ffffa051410d5148
> [ 15.148660] x8 : 0000000000000000 x7 : ffffffffd3039400 x6 : 0000000000000202
> [ 15.151071] x5 : ffff6f1e08a0dd00 x4 : 0000000000000004 x3 : 0000000000000007
> [ 15.153435] x2 : ffff6f1e08a0e560 x1 : 0000000000000180 x0 : 00000000fffffe80
> [ 15.155835] Call trace:
> [ 15.156675] sk_stream_kill_queues+0x104/0x130
> [ 15.163667] inet_csk_destroy_sock+0x68/0x130
> [ 15.165139] tcp_done+0x120/0x1b0
> [ 15.166274] tcp_reset+0x74/0x130
> [ 15.167445] tcp_validate_incoming+0x394/0x510
> [ 15.168953] tcp_rcv_state_process+0x2d8/0x15c0
> [ 15.170512] tcp_v4_do_rcv+0x15c/0x2d4
> [ 15.171798] tcp_v4_rcv+0x9c0/0xaa4
> [ 15.173009] ip_protocol_deliver_rcu+0x4c/0x184
> [ 15.174597] ip_local_deliver_finish+0x74/0x90
> [ 15.176103] ip_local_deliver+0x88/0x130
> [ 15.177429] ip_rcv+0x7c/0x130
> [ 15.178512] __netif_receive_skb_one_core+0x60/0x8c
> [ 15.180143] __netif_receive_skb+0x20/0x70
> [ 15.181527] netif_receive_skb+0x48/0x1e0
> [ 15.182930] tun_get_user+0xbe4/0xd70 [tun]
> [ 15.184368] tun_chr_write_iter+0x68/0xf0 [tun]
> [ 15.185936] do_iter_readv_writev+0x100/0x1a4
> [ 15.187448] do_iter_write+0x98/0x1fc
> [ 15.188735] vfs_writev+0xb4/0x170
> [ 15.189956] do_writev+0x7c/0x140
> [ 15.191115] __arm64_sys_writev+0x2c/0x40
> [ 15.192481] invoke_syscall+0x50/0x120
> [ 15.193754] el0_svc_common.constprop.0+0xf4/0x104
> [ 15.195391] do_el0_svc+0x34/0x9c
> [ 15.196558] el0_svc+0x2c/0x54
> [ 15.197653] el0_sync_handler+0xa4/0x130
> [ 15.199054] el0_sync+0x198/0x1c0
> [ 15.200182] ---[ end trace c9faa1be6c93e4fb ]---
> [ 15.201864] ------------[ cut here ]------------
> [ 15.203404] WARNING: CPU: 3 PID: 671 at net/core/stream.c:208 sk_stream_kill_queues+0x110/0x130
> [ 15.206141] Modules linked in: tun crct10dif_ce rfkill fuse
> [ 15.207976] CPU: 3 PID: 671 Comm: packetdrill Tainted: G W 5.13.9-rc1 #1
> [ 15.210542] Hardware name: linux,dummy-virt (DT)
> [ 15.212029] pstate: 80000005 (Nzcv daif -PAN -UAO -TCO BTYPE=--)
> [ 15.213947] pc : sk_stream_kill_queues+0x110/0x130
> [ 15.215546] lr : inet_csk_destroy_sock+0x68/0x130
> [ 15.217097] sp : ffff8000109f36d0
> [ 15.218204] x29: ffff8000109f36d0 x28: 0000000000000005 x27: fffffffffffffff2
> [ 15.220548] x26: 0000000000000001 x25: ffffa05141000900 x24: ffff6f1e0b51dc40
> [ 15.223012] x23: 0000000000000000 x22: 0000000000000000 x21: ffff6f1e11e7e054
> [ 15.225350] x20: ffff6f1e08a0dd30 x19: ffff6f1e08a0dc80 x18: 0000000000000000
> [ 15.227705] x17: 0000000000000000 x16: 0000000000000000 x15: 00000000238ecea0
> [ 15.230042] x14: 0000000000000000 x13: 000000000000dd86 x12: 000000007ffff000
> [ 15.232454] x11: 0000000000000004 x10: 0000000000000000 x9 : ffffa051410d5148
> [ 15.234896] x8 : 0000000000000000 x7 : ffffffffd3039400 x6 : 0000000000000202
> [ 15.237544] x5 : ffff6f1e08a0dd00 x4 : 0000000000000004 x3 : 0000000000000007
> [ 15.239933] x2 : ffff6f1e08a0e560 x1 : 0000000000000180 x0 : 0000000000000180
> [ 15.242315] Call trace:
> [ 15.243203] sk_stream_kill_queues+0x110/0x130
> [ 15.244721] inet_csk_destroy_sock+0x68/0x130
> [ 15.246191] tcp_done+0x120/0x1b0
> [ 15.247388] tcp_reset+0x74/0x130
> [ 15.248540] tcp_validate_incoming+0x394/0x510
> [ 15.250057] tcp_rcv_state_process+0x2d8/0x15c0
> [ 15.251636] tcp_v4_do_rcv+0x15c/0x2d4
> [ 15.252908] tcp_v4_rcv+0x9c0/0xaa4
> [ 15.254095] ip_protocol_deliver_rcu+0x4c/0x184
> [ 15.255643] ip_local_deliver_finish+0x74/0x90
> [ 15.257138] ip_local_deliver+0x88/0x130
> [ 15.258516] ip_rcv+0x7c/0x130
> [ 15.259573] __netif_receive_skb_one_core+0x60/0x8c
> [ 15.261561] __netif_receive_skb+0x20/0x70
> [ 15.263355] netif_receive_skb+0x48/0x1e0
> [ 15.264721] tun_get_user+0xbe4/0xd70 [tun]
> [ 15.266110] tun_chr_write_iter+0x68/0xf0 [tun]
> [ 15.267704] do_iter_readv_writev+0x100/0x1a4
> [ 15.269162] do_iter_write+0x98/0x1fc
> [ 15.270721] vfs_writev+0xb4/0x170
> [ 15.271948] do_writev+0x7c/0x140
> [ 15.273120] __arm64_sys_writev+0x2c/0x40
> [ 15.274527] invoke_syscall+0x50/0x120
> [ 15.275805] el0_svc_common.constprop.0+0xf4/0x104
> [ 15.277278] do_el0_svc+0x34/0x9c
> [ 15.278832] el0_svc+0x2c/0x54
> [ 15.280081] el0_sync_handler+0xa4/0x130
> [ 15.281614] el0_sync+0x198/0x1c0
> [ 15.282899] ---[ end trace c9faa1be6c93e4fc ]---
> [ 15.284650] ------------[ cut here ]------------
> [ 15.286116] WARNING: CPU: 3 PID: 671 at net/ipv4/af_inet.c:156 inet_sock_destruct+0x190/0x1b0
> [ 15.288999] Modules linked in: tun crct10dif_ce rfkill fuse
> [ 15.290884] CPU: 3 PID: 671 Comm: packetdrill Tainted: G W 5.13.9-rc1 #1
> [ 15.294511] Hardware name: linux,dummy-virt (DT)
> [ 15.296041] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> [ 15.298073] pc : inet_sock_destruct+0x190/0x1b0
> [ 15.299743] lr : __sk_destruct+0x38/0x23c
> [ 15.301154] sp : ffff8000109f37c0
> [ 15.302291] x29: ffff8000109f37c0 x28: 0000000000000005 x27: fffffffffffffff2
> [ 15.304703] x26: 0000000000000001 x25: ffffa05141000900 x24: ffffa05142a8be80
> [ 15.307100] x23: 0000000000000000 x22: ffff6f1e08a0dd08 x21: ffff6f1e08a0dc80
> [ 15.309419] x20: ffff6f1e08a0dd30 x19: ffff6f1e08a0dc80 x18: 0000000000000000
> [ 15.311764] x17: 0000000000000000 x16: 0000000000000000 x15: 00000000238ecea0
> [ 15.314056] x14: 0000000000000000 x13: 000000000000dd86 x12: 000000007ffff000
> [ 15.316392] x11: 0000000000000004 x10: 0000000000000000 x9 : ffffa05141003acc
> [ 15.318680] x8 : 0000000000000000 x7 : ffffffffd3039400 x6 : ffff6f1e08a0ddbc
> [ 15.320920] x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffffceccfd9db000
> [ 15.323197] x2 : ffff6f1e02c00810 x1 : 0000000000000180 x0 : 00000000fffffe80
> [ 15.325465] Call trace:
> [ 15.326270] inet_sock_destruct+0x190/0x1b0
> [ 15.327643] __sk_destruct+0x38/0x23c
> [ 15.328842] __sk_free+0x80/0x120
> [ 15.329923] sk_free+0x68/0x90
> [ 15.330941] sock_put+0x5c/0x80
> [ 15.331963] tcp_v4_rcv+0xa40/0xaa4
> [ 15.333091] ip_protocol_deliver_rcu+0x4c/0x184
> [ 15.334585] ip_local_deliver_finish+0x74/0x90
> [ 15.336000] ip_local_deliver+0x88/0x130
> [ 15.337273] ip_rcv+0x7c/0x130
> [ 15.338277] __netif_receive_skb_one_core+0x60/0x8c
> [ 15.339869] __netif_receive_skb+0x20/0x70
> [ 15.341221] netif_receive_skb+0x48/0x1e0
> [ 15.342558] tun_get_user+0xbe4/0xd70 [tun]
> [ 15.343929] tun_chr_write_iter+0x68/0xf0 [tun]
> [ 15.345395] do_iter_readv_writev+0x100/0x1a4
> [ 15.346831] do_iter_write+0x98/0x1fc
> [ 15.348025] vfs_writev+0xb4/0x170
> [ 15.349136] do_writev+0x7c/0x140
> [ 15.350224] __arm64_sys_writev+0x2c/0x40
> [ 15.351556] invoke_syscall+0x50/0x120
> [ 15.352789] el0_svc_common.constprop.0+0xf4/0x104
> [ 15.354335] do_el0_svc+0x34/0x9c
> [ 15.355451] el0_svc+0x2c/0x54
> [ 15.356460] el0_sync_handler+0xa4/0x130
> [ 15.357740] el0_sync+0x198/0x1c0
> [ 15.358861] ---[ end trace c9faa1be6c93e4fd ]---
> [ 15.360555] ------------[ cut here ]------------
> [ 15.361959] WARNING: CPU: 3 PID: 671 at net/ipv4/af_inet.c:157 inet_sock_destruct+0x16c/0x1b0
> [ 15.364603] Modules linked in: tun crct10dif_ce rfkill fuse
> [ 15.366276] CPU: 3 PID: 671 Comm: packetdrill Tainted: G W 5.13.9-rc1 #1
> [ 15.373031] Hardware name: linux,dummy-virt (DT)
> [ 15.377359] pstate: 60000005 (nZCv daif -PAN -UAO -TCO BTYPE=--)
> [ 15.381502] pc : inet_sock_destruct+0x16c/0x1b0
> [ 15.384348] lr : __sk_destruct+0x38/0x23c
> [ 15.387724] sp : ffff8000109f37c0
> [ 15.388842] x29: ffff8000109f37c0 x28: 0000000000000005 x27: fffffffffffffff2
> [ 15.391840] x26: 0000000000000001 x25: ffffa05141000900 x24: ffffa05142a8be80
> [ 15.394837] x23: 0000000000000000 x22: ffff6f1e08a0dd08 x21: ffff6f1e08a0dc80
> [ 15.398292] x20: ffff6f1e08a0dd30 x19: ffff6f1e08a0dc80 x18: 0000000000000000
> [ 15.401315] x17: 0000000000000000 x16: 0000000000000000 x15: 00000000238ecea0
> [ 15.403655] x14: 0000000000000000 x13: 000000000000dd86 x12: 000000007ffff000
> [ 15.405967] x11: 0000000000000004 x10: 0000000000000000 x9 : ffffa05141003acc
> [ 15.408287] x8 : 0000000000000000 x7 : ffffffffd3039400 x6 : ffff6f1e08a0ddbc
> [ 15.410616] x5 : 0000000000000001 x4 : 0000000000000000 x3 : ffffceccfd9db000
> [ 15.412969] x2 : ffff6f1e02c00810 x1 : 0000000000000180 x0 : 0000000000000180
> [ 15.415332] Call trace:
> [ 15.416187] inet_sock_destruct+0x16c/0x1b0
> [ 15.417551] __sk_destruct+0x38/0x23c
> [ 15.418793] __sk_free+0x80/0x120
> [ 15.419886] sk_free+0x68/0x90
> [ 15.420916] sock_put+0x5c/0x80
> [ 15.421974] tcp_v4_rcv+0xa40/0xaa4
> [ 15.423182] ip_protocol_deliver_rcu+0x4c/0x184
> [ 15.424649] ip_local_deliver_finish+0x74/0x90
> [ 15.426214] ip_local_deliver+0x88/0x130
> [ 15.427869] ip_rcv+0x7c/0x130
> [ 15.428884] __netif_receive_skb_one_core+0x60/0x8c
> [ 15.430545] __netif_receive_skb+0x20/0x70
> [ 15.432679] netif_receive_skb+0x48/0x1e0
> [ 15.435651] tun_get_user+0xbe4/0xd70 [tun]
> [ 15.437662] tun_chr_write_iter+0x68/0xf0 [tun]
> [ 15.439835] do_iter_readv_writev+0x100/0x1a4
> [ 15.441836] do_iter_write+0x98/0x1fc
> [ 15.443588] vfs_writev+0xb4/0x170
> [ 15.445150] do_writev+0x7c/0x140
> [ 15.446713] __arm64_sys_writev+0x2c/0x40
> [ 15.448639] invoke_syscall+0x50/0x120
> [ 15.450373] el0_svc_common.constprop.0+0xf4/0x104
> [ 15.452642] do_el0_svc+0x34/0x9c
> [ 15.454230] el0_svc+0x2c/0x54
> [ 15.455743] el0_sync_handler+0xa4/0x130
> [ 15.457486] el0_sync+0x198/0x1c0
> [ 15.458659] ---[ end trace c9faa1be6c93e4fe ]---
>
> Reported-by: Linux Kernel Functional Testing <[email protected]>
>
> --
> Linaro LKFT
> https://lkft.linaro.org