Date: Thu, 15 Aug 2019 18:11:29 -0700
From: Jakub Kicinski
To: Hillf Danton
Cc: syzbot, ast@kernel.org, aviadye@mellanox.com, borisp@mellanox.com,
    bpf@vger.kernel.org, daniel@iogearbox.net, davejwatson@fb.com,
    davem@davemloft.net, john.fastabend@gmail.com, kafai@fb.com,
    linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
    songliubraving@fb.com, syzkaller-bugs@googlegroups.com, yhs@fb.com
Subject: Re: INFO: task hung in tls_sw_release_resources_tx
Message-ID: <20190815181129.561cef8f@cakuba.netronome.com>
In-Reply-To: <20190815141419.15036-1-hdanton@sina.com>
References: <20190815141419.15036-1-hdanton@sina.com>
Organization: Netronome Systems, Ltd.
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, 15 Aug 2019 22:14:19 +0800, Hillf Danton wrote:
> On Thu, 15 Aug 2019 03:54:06 -0700
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    6d5afe20 sctp: fix memleak in sctp_send_reset_streams
> > git tree:       net
> > console output: https://syzkaller.appspot.com/x/log.txt?x=16e5536a600000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=a4c9e9f08e9e8960
> > dashboard link: https://syzkaller.appspot.com/bug?extid=6a9ff159672dfbb41c95
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=17cb0502600000
> > C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14d5dc22600000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+6a9ff159672dfbb41c95@syzkaller.appspotmail.com
> >
> > INFO: task syz-executor153:10198 blocked for more than 143 seconds.
> >       Not tainted 5.3.0-rc3+ #162
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > syz-executor153 D27672 10198 10179 0x80000002
> > Call Trace:
> >  context_switch kernel/sched/core.c:3254 [inline]
> >  __schedule+0x755/0x1580 kernel/sched/core.c:3880
> >  schedule+0xa8/0x270 kernel/sched/core.c:3944
> >  schedule_timeout+0x717/0xc50 kernel/time/timer.c:1783
> >  do_wait_for_common kernel/sched/completion.c:83 [inline]
> >  __wait_for_common kernel/sched/completion.c:104 [inline]
> >  wait_for_common kernel/sched/completion.c:115 [inline]
> >  wait_for_completion+0x29c/0x440 kernel/sched/completion.c:136
> >  crypto_wait_req include/linux/crypto.h:685 [inline]
> >  crypto_wait_req include/linux/crypto.h:680 [inline]
> >  tls_sw_release_resources_tx+0x4ee/0x6b0 net/tls/tls_sw.c:2075
> >  tls_sk_proto_cleanup net/tls/tls_main.c:275 [inline]
> >  tls_sk_proto_close+0x686/0x970 net/tls/tls_main.c:305
> >  inet_release+0xed/0x200 net/ipv4/af_inet.c:427
> >  inet6_release+0x53/0x80 net/ipv6/af_inet6.c:470
> >  __sock_release+0xce/0x280 net/socket.c:590
> >  sock_close+0x1e/0x30 net/socket.c:1268
> >  __fput+0x2ff/0x890 fs/file_table.c:280
> >  ____fput+0x16/0x20 fs/file_table.c:313
> >  task_work_run+0x145/0x1c0 kernel/task_work.c:113
> >  exit_task_work include/linux/task_work.h:22 [inline]
> >  do_exit+0x92f/0x2e50 kernel/exit.c:879
> >  do_group_exit+0x135/0x360 kernel/exit.c:983
> >  __do_sys_exit_group kernel/exit.c:994 [inline]
> >  __se_sys_exit_group kernel/exit.c:992 [inline]
> >  __x64_sys_exit_group+0x44/0x50 kernel/exit.c:992
> >  do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x43ff88
> > Code: 00 00 be 3c 00 00 00 eb 19 66 0f 1f 84 00 00 00 00 00 48 89 d7 89 f0
> > 0f 05 48 3d 00 f0 ff ff 77 21 f4 48 89 d7 44 89 c0 0f 05 <48> 3d 00 f0 ff
> > ff 76 e0 f7 d8 64 41 89 01 eb d8 0f 1f 84 00 00 00
> > RSP: 002b:00007ffd1c2d0f78 EFLAGS: 00000246 ORIG_RAX: 00000000000000e7
> > RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000000000043ff88
> > RDX: 0000000000000000 RSI: 000000000000003c RDI: 0000000000000000
> > RBP: 00000000004bf890 R08: 00000000000000e7 R09: ffffffffffffffd0
> > R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> > R13: 00000000006d1180 R14: 0000000000000000 R15: 0000000000000000
> > INFO: lockdep is turned off.
> > NMI backtrace for cpu 0
> > CPU: 0 PID: 1057 Comm: khungtaskd Not tainted 5.3.0-rc3+ #162
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x172/0x1f0 lib/dump_stack.c:113
> >  nmi_cpu_backtrace.cold+0x70/0xb2 lib/nmi_backtrace.c:101
> >  nmi_trigger_cpumask_backtrace+0x23b/0x28b lib/nmi_backtrace.c:62
> >  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
> >  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
> >  check_hung_uninterruptible_tasks kernel/hung_task.c:205 [inline]
> >  watchdog+0x9d0/0xef0 kernel/hung_task.c:289
> >  kthread+0x361/0x430 kernel/kthread.c:255
> >  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
> > Sending NMI from CPU 0 to CPUs 1:
> > NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0xe/0x10
> > arch/x86/include/asm/irqflags.h:60
>
> 1, diff -> commit f87e62d45e51 -> commit 1023121375c6
>
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2167,11 +2167,13 @@ static void tx_work_handler(struct work_
>  		return;
>  
>  	ctx = tls_sw_ctx_tx(tls_ctx);
> -	if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> -		return;
> -
> -	if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> -		return;
> +	if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask)) {
> +		if (!test_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> +			return;
> +	} else {
> +		if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> +			return;
> +	}
>  	lock_sock(sk);
>  	tls_tx_records(sk, -1);
>  	release_sock(sk);
> --
>
> 2, a simpler one. And clear BIT_TX_SCHEDULED perhaps after releasing sock.
>
> --- a/net/tls/tls_sw.c
> +++ b/net/tls/tls_sw.c
> @@ -2167,11 +2167,9 @@ static void tx_work_handler(struct work_
>  		return;
>  
>  	ctx = tls_sw_ctx_tx(tls_ctx);
> -	if (test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> -		return;
> +	if (!test_bit(BIT_TX_CLOSING, &ctx->tx_bitmask))
> +		clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask);
>  
> -	if (!test_and_clear_bit(BIT_TX_SCHEDULED, &ctx->tx_bitmask))
> -		return;
>  	lock_sock(sk);
>  	tls_tx_records(sk, -1);
>  	release_sock(sk);

Mmm.. too terse, I don't follow what you're trying to do here :(

I've been staring at this for a while and trying to repro but it's not
happening here. The only thing I see is that EBUSY is not handled.
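
For illustration only, here is a userspace sketch of one way "EBUSY is not
handled" could matter. It assumes that a submit call returning -EBUSY, like
one returning -EINPROGRESS, still runs its completion callback later, and
that a counter of in-flight requests is what the close path waits on. The
names tx_ctx, submit, complete and pending_after_busy_request are
hypothetical and are not taken from net/tls/tls_sw.c.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for per-socket TX state; not the kernel's struct. */
struct tx_ctx {
	int pending;	/* requests whose completion callback has not run yet */
};

/* Assumed convention: both codes mean "the callback will run later". */
static int request_is_async(int rc)
{
	return rc == -EINPROGRESS || rc == -EBUSY;
}

/*
 * Account for one submitted request. With handle_ebusy == 0 only
 * -EINPROGRESS counts as asynchronous, so -EBUSY is treated like a
 * synchronous result and the pending count is dropped immediately.
 */
static void submit(struct tx_ctx *ctx, int rc, int handle_ebusy)
{
	int async = handle_ebusy ? request_is_async(rc) : rc == -EINPROGRESS;

	ctx->pending++;
	if (!async)
		ctx->pending--;
}

/* Completion callback: runs once per request that finished asynchronously. */
static void complete(struct tx_ctx *ctx)
{
	ctx->pending--;
}

/* Submit one backlogged (-EBUSY) request, let it complete, return the count. */
static int pending_after_busy_request(int handle_ebusy)
{
	struct tx_ctx ctx = { .pending = 0 };

	submit(&ctx, -EBUSY, handle_ebusy);
	complete(&ctx);
	return ctx.pending;
}
```

In this model, pending_after_busy_request(0) leaves the counter unbalanced
after a single backlogged request, while pending_after_busy_request(1)
returns it to zero. Whether that is what happens in the hang reported above
is not established here.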