Date: Fri, 27 Mar 2020 18:53:50 +0100
From: Sebastian Andrzej Siewior
To: kernel test robot
Cc: Thomas Gleixner, Ingo Molnar, "Peter Zijlstra (Intel)",
    linux-kernel@vger.kernel.org, LKP, Tejun Heo, Lai Jiangshan
Subject: [PATCH] workqueue: Don't double assign worker->sleeping
Message-ID: <20200327175350.rw5gex6cwum3ohnu@linutronix.de>
References: <20200327074308.GY11705@shao2-debian>
In-Reply-To: <20200327074308.GY11705@shao2-debian>

The kernel test robot triggered a warning with the following race:

   task-ctx                             interrupt-ctx
 worker
  -> process_one_work()
    -> work_item()
      -> schedule();
         -> sched_submit_work()
           -> wq_worker_sleeping()
             -> ->sleeping = 1
               atomic_dec_and_test(nr_running)
         __schedule();                  *interrupt*
                                        async_page_fault()
                                        -> local_irq_enable();
                                        -> schedule();
                                           -> sched_submit_work()
                                             -> wq_worker_sleeping()
                                                -> if (WARN_ON(->sleeping)) return
                                           -> __schedule()
                                             -> sched_update_worker()
                                               -> wq_worker_running()
                                                  -> atomic_inc(nr_running);
                                                  -> ->sleeping = 0;

      -> sched_update_worker()
        -> wq_worker_running()
          if (!->sleeping) return

In this context the warning is pointless; everything is fine.
However, if the interrupt occurs in wq_worker_sleeping() between reading
and setting `sleeping', i.e.

|	if (WARN_ON_ONCE(worker->sleeping))
|		return;
 *interrupt*
|	worker->sleeping = 1;

then pool->nr_running will be decremented twice in wq_worker_sleeping()
but it will be incremented only once in wq_worker_running().

Replace the assignment of `sleeping' with a cmpxchg_local() to ensure
that there is no double assignment of the variable. The variable is only
accessed from the local CPU. Remove the WARN statement because this
condition can be valid.

An alternative would be to move `->sleeping' to `->flags' as a new bit,
but this would require acquiring pool->lock in wq_worker_running().

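For illustration only, the following user-space sketch replays the first
interleaving above with the two patched checks. It is not kernel code: C11
atomic_compare_exchange_strong() stands in for cmpxchg_local(), nr_running
is a plain int rather than pool->nr_running, the pool->lock and
WORKER_NOT_RUNNING handling are left out, and every model_* name as well as
replay_first_interleaving() is made up for this example.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for struct worker plus pool->nr_running. */
struct model_worker {
	atomic_int sleeping;
	int nr_running;
};

/*
 * Mimic the return convention of cmpxchg_local(): store 'desired' only if
 * the current value equals 'expected', and always return the old value.
 */
static int model_cmpxchg(atomic_int *ptr, int expected, int desired)
{
	int old = expected;

	/* On failure 'old' is overwritten with the value actually found. */
	atomic_compare_exchange_strong(ptr, &old, desired);
	return old;
}

/* Model of the patched check in wq_worker_sleeping(). */
static void model_worker_sleeping(struct model_worker *w)
{
	if (model_cmpxchg(&w->sleeping, 0, 1) == 1)
		return;		/* already marked sleeping, nothing to do */
	w->nr_running--;
}

/* Model of the patched check in wq_worker_running(). */
static void model_worker_running(struct model_worker *w)
{
	if (model_cmpxchg(&w->sleeping, 1, 0) == 0)
		return;		/* was not marked sleeping, nothing to do */
	w->nr_running++;
}

/* Replay the call order of the first diagram; return the final count. */
int replay_first_interleaving(void)
{
	struct model_worker w;

	atomic_init(&w.sleeping, 0);
	w.nr_running = 1;

	model_worker_sleeping(&w);	/* task:      0 -> 1, count 1 -> 0 */
	model_worker_sleeping(&w);	/* interrupt: already 1, no-op     */
	model_worker_running(&w);	/* interrupt: 1 -> 0, count 0 -> 1 */
	model_worker_running(&w);	/* task:      already 0, no-op     */

	assert(atomic_load(&w.sleeping) == 0);
	return w.nr_running;
}
```

In this model each 0 -> 1 transition of `sleeping' is paired with exactly
one decrement and each 1 -> 0 transition with exactly one increment, so
calling replay_first_interleaving() from a small main() leaves the count at
its starting value of 1 without needing a warning for the nested call.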
Fixes: 6d25be5782e48 ("sched/core, workqueues: Distangle worker accounting from rq lock")
Link: https://lkml.kernel.org/r/20200327074308.GY11705@shao2-debian
Reported-by: kernel test robot
Signed-off-by: Sebastian Andrzej Siewior
---
 kernel/workqueue.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4e01c448b4b48..dc477a2a3ce30 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -846,11 +846,10 @@ void wq_worker_running(struct task_struct *task)
 {
 	struct worker *worker = kthread_data(task);
 
-	if (!worker->sleeping)
+	if (cmpxchg_local(&worker->sleeping, 1, 0) == 0)
 		return;
 	if (!(worker->flags & WORKER_NOT_RUNNING))
 		atomic_inc(&worker->pool->nr_running);
-	worker->sleeping = 0;
 }
 
 /**
@@ -875,10 +874,9 @@ void wq_worker_sleeping(struct task_struct *task)
 
 	pool = worker->pool;
 
-	if (WARN_ON_ONCE(worker->sleeping))
+	if (cmpxchg_local(&worker->sleeping, 0, 1) == 1)
 		return;
 
-	worker->sleeping = 1;
 	spin_lock_irq(&pool->lock);
 
 	/*
-- 
2.26.0