From: Thomas Schoebel-Theuer
To: Ingo Molnar, Peter Zijlstra, linux-kernel@vger.kernel.org (open list:SCHEDULER)
Subject: [PATCH] sched/wait: fix endless kthread loop at timeout
Date: Mon, 29 Jul 2019 19:14:26 +0200
Message-Id: <20190729171427.6234-1-tst@schoebel-theuer.de>
X-Mailer: git-send-email 2.12.3
X-Mailing-List: linux-kernel@vger.kernel.org

Scenario, possible in kernel 4.11.x and later:

1) A kthread calls a waiting function with a timeout, and blocks.
2) kthread_stop() is called by somebody else.
3) The waiting condition does not change for a long time.
4) Nothing happens => normally the timeout would be reached by the kthread.

However, the && in wait_woken() now prevents any call to
schedule_timeout(). As a consequence, the timeout value is never
decreased. The timeout is therefore never reached, and the kthread
spins in an endless loop, burning the CPU in kernel mode.

This fix ensures the following semantics: kthread_should_stop() is
treated as equivalent to a timeout. This is beneficial because most
users do not want to wait for the timeout, but to stop the kthread as
soon as possible.

It appears that these semantics were probably intended (otherwise the
check is_kthread_should_stop() would not make much sense), but just
went wrong due to the bug.

Here is an example, triggered by the external kernel module MARS on a
production kernel. However, the problem can be triggered by other
kthreads and on newer kernels, and also in very different scenarios,
not only during tcp_recvmsg().

In the following example, the kthread simply waits for network packets
to arrive, but in the test scenario the network had been blocked
underneath by a firewall rule in order to trigger the bug:

Mar 08 07:40:08 icpu5133 kernel: watchdog: BUG: soft lockup - CPU#29 stuck for 23s! [mars_receiver8.:8139]
Mar 08 07:40:08 icpu5133 kernel: Modules linked in: mars(-) ip6table_mangle ip6table_raw iptable_raw ip_set_bitmap_port xt_DSCP xt_multiport ip_set_hash_ip xt_own
Mar 08 07:40:08 icpu5133 kernel: irq event stamp: 300719885
Mar 08 07:40:08 icpu5133 kernel: hardirqs last enabled at (300719883): [] _raw_spin_unlock_irqrestore+0x3d/0x4f
Mar 08 07:40:08 icpu5133 kernel: hardirqs last disabled at (300719885): [] apic_timer_interrupt+0x82/0x90
Mar 08 07:40:08 icpu5133 kernel: softirqs last enabled at (300719878): [] lock_sock_nested+0x50/0x98
Mar 08 07:40:08 icpu5133 kernel: softirqs last disabled at (300719884): [] release_sock+0x16/0xda
Mar 08 07:40:08 icpu5133 kernel: CPU: 29 PID: 8139 Comm: mars_receiver8. Not tainted 4.14.104+ #121
Mar 08 07:40:08 icpu5133 kernel: Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.5.5 08/16/2017
Mar 08 07:40:08 icpu5133 kernel: task: ffff88bf82764fc0 task.stack: ffffc90012430000
Mar 08 07:40:08 icpu5133 kernel: RIP: 0010:arch_local_irq_restore+0x2/0x8
Mar 08 07:40:08 icpu5133 kernel: RSP: 0018:ffffc90012433b78 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10
Mar 08 07:40:08 icpu5133 kernel: RAX: 0000000000000000 RBX: ffff88bf82764fc0 RCX: 00000000fec792b4
Mar 08 07:40:08 icpu5133 kernel: RDX: 00000000c18b50d3 RSI: 0000000000000000 RDI: 0000000000000246
Mar 08 07:40:08 icpu5133 kernel: RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
Mar 08 07:40:08 icpu5133 kernel: R10: ffffc90012433b08 R11: ffffc90012433ba8 R12: 0000000000000246
Mar 08 07:40:08 icpu5133 kernel: R13: ffffffff819df735 R14: 0000000000000001 R15: ffff88bf82765818
Mar 08 07:40:08 icpu5133 kernel: FS: 0000000000000000(0000) GS:ffff88c05fb80000(0000) knlGS:0000000000000000
Mar 08 07:40:08 icpu5133 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 08 07:40:08 icpu5133 kernel: CR2: 000055abd12eb688 CR3: 000000000241e006 CR4: 00000000001606e0
Mar 08 07:40:08 icpu5133 kernel: Call Trace:
Mar 08 07:40:08 icpu5133 kernel: lock_release+0x32f/0x33b
Mar 08 07:40:08 icpu5133 kernel: release_sock+0x90/0xda
Mar 08 07:40:08 icpu5133 kernel: sk_wait_data+0x7f/0x13f
Mar 08 07:40:08 icpu5133 kernel: ? prepare_to_wait_exclusive+0xc1/0xc1
Mar 08 07:40:08 icpu5133 kernel: tcp_recvmsg+0x4e6/0x91a
Mar 08 07:40:08 icpu5133 kernel: ? flush_signals+0x2b/0x6a
Mar 08 07:40:08 icpu5133 kernel: ? lock_acquire+0x20a/0x25a
Mar 08 07:40:08 icpu5133 kernel: inet_recvmsg+0x8d/0xc0
Mar 08 07:40:08 icpu5133 kernel: kernel_recvmsg+0x8f/0xaa
Mar 08 07:40:08 icpu5133 kernel: ? ___might_sleep+0xf2/0x256
Mar 08 07:40:08 icpu5133 kernel: mars_recv_raw+0x22a/0x4da [mars]
Mar 08 07:40:08 icpu5133 kernel: desc_recv_struct+0x40/0x375 [mars]
Mar 08 07:40:08 icpu5133 kernel: receiver_thread+0xa2/0x61a [mars]
Mar 08 07:40:08 icpu5133 kernel: ? _hash_insert+0x160/0x160 [mars]
Mar 08 07:40:08 icpu5133 kernel: ? kthread+0x1a6/0x1ae
Mar 08 07:40:08 icpu5133 kernel: kthread+0x1a6/0x1ae
Mar 08 07:40:08 icpu5133 kernel: ? __list_del_entry+0x60/0x60
Mar 08 07:40:08 icpu5133 kernel: ret_from_fork+0x3a/0x50
Mar 08 07:40:08 icpu5133 kernel: Code: ee e8 c5 17 00 00 48 85 db 75 0e 31 f6 48 c7 c7 c0 5f 53 82 e8 68 b9 58 00 48 89 5b 58 58 5b 5d c3 9c 58 0f 1f 44 00 00 c3

Signed-off-by: Thomas Schoebel-Theuer
---
 kernel/sched/wait.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index c1e566a114ca..08f121154a91 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -412,8 +412,15 @@ long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
 	 * or woken_wake_function() sees our store to current->state.
 	 */
 	set_current_state(mode); /* A */
-	if (!(wq_entry->flags & WQ_FLAG_WOKEN) && !is_kthread_should_stop())
-		timeout = schedule_timeout(timeout);
+	if (!(wq_entry->flags & WQ_FLAG_WOKEN)) {
+		/*
+		 * Treat kthread stopping as equivalent to a timeout.
+		 */
+		if (is_kthread_should_stop())
+			timeout = 0;
+		else
+			timeout = schedule_timeout(timeout);
+	}
 	__set_current_state(TASK_RUNNING);
 
 	/*
-- 
2.12.3
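
For illustration only, the difference between the old and the new
condition in wait_woken() can be modelled in user space. The sketch
below is not kernel code: struct sim, sim_schedule_timeout(),
wait_woken_old(), wait_woken_new() and count_iterations() are
hypothetical names, the sleep is replaced by a stub that consumes the
whole timeout, and the caller is assumed to retry as long as some
timeout remains. Clearing of the woken flag and re-checking of the
wait condition are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the kernel state involved. */
struct sim {
	bool woken;       /* models wq_entry->flags & WQ_FLAG_WOKEN */
	bool should_stop; /* models is_kthread_should_stop() */
};

/* Stub for schedule_timeout(): pretend the whole timeout elapsed. */
static long sim_schedule_timeout(long timeout)
{
	(void)timeout;
	return 0;
}

/* Condition as before the patch: a stopping kthread never sleeps. */
static long wait_woken_old(const struct sim *s, long timeout)
{
	if (!s->woken && !s->should_stop)
		timeout = sim_schedule_timeout(timeout);
	return timeout; /* unchanged when should_stop is set */
}

/* Condition as in the patch: stopping is treated like a timeout. */
static long wait_woken_new(const struct sim *s, long timeout)
{
	if (!s->woken) {
		if (s->should_stop)
			timeout = 0;
		else
			timeout = sim_schedule_timeout(timeout);
	}
	return timeout;
}

/*
 * Assumed caller: keeps waiting while timeout remains. max_iter caps
 * the loop so that the endless case terminates in this model.
 */
static int count_iterations(long (*wait_fn)(const struct sim *, long),
			    const struct sim *s, long timeout, int max_iter)
{
	int n = 0;

	assert(max_iter > 0);
	while (timeout > 0 && n < max_iter) {
		timeout = wait_fn(s, timeout);
		n++;
	}
	return n;
}
```

With should_stop set and woken clear, count_iterations() with
wait_woken_old runs into the max_iter cap because the timeout never
changes, while with wait_woken_new it returns after a single pass.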