Received: by 2002:a25:6193:0:0:0:0:0 with SMTP id v141csp80755ybb; Fri, 27 Mar 2020 16:32:44 -0700 (PDT) X-Google-Smtp-Source: ADFU+vuZ4JDjfTyVqDBwq49zYaCsMPUBZ66iysfvmNUGc+uQH4G07O9vu+NS080604gvBvo8Zvks X-Received: by 2002:a05:6830:3150:: with SMTP id c16mr827394ots.251.1585351964357; Fri, 27 Mar 2020 16:32:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1585351964; cv=none; d=google.com; s=arc-20160816; b=IZw9hGV3lyowLMY4E5t+x3FWheZFlDvT+713qm8hXROLvkYbkiaU3OGtB4R9HbJtU7 b653BBqKHEcrq5UbVD2bFT+2anCnLHPBvS+9fUfDuEpfKtFVVigNjLkfYJDakd5Uiwiw hzlpbELj2zgtEiE6p9D+V/2D08wmE8wMmFE7wvS+Ymx2sDnneuZ0HpQVgiuEp00lCUqR xsiLx6Li02Vf8cen+bGCrFapZMpj0jUNZZjh+LQaV6KJwEs0oDL+KkaWSIJCuMaU81Nh sSLV0N8ge0JCQ8Z8C1lmM/Mkvb/mHwOCzM26BWnhnoFqLzwK91yyb7yWEWWETJ5IV3HY NPUA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-transfer-encoding :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=saww5/aGZLy4rud1oEq3kxIMPJM2SVzb02KB3ULsbbM=; b=V8Q1sWzb20WHR3YGx0Umq2S7f5WuInfEHUCINvHHAxqX9ghGm/Y2KvonTygLhWdo4S kQQtq1bPPNV/G1q3Mo9belBdAy1Y7YsgIN1/5Ql0Vsje2mzHIafj2Fw9KVzd7QHODtgv 3ph4ZelDu+qMoM9tK7FM0WfHDcPXddR/CnlZHDbyA/rxjjxAqSkAivq21FJmFR+jdE2D 81bX6crdP488cd6wvmdAJPXu/WypWGipY4xgvEkN8+wS0cdAopH4rZISJc4+JfnrFeuT uXNU5f/HsWGpSCTtwLkl/30mcxe/SjWGh5Z3CeC2/NSwHZvLQKbmd+gGLrh5FL9NW+ci UiJw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id z23si2864165oos.54.2020.03.27.16.32.31; Fri, 27 Mar 2020 16:32:44 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727126AbgC0XaI convert rfc822-to-8bit (ORCPT + 99 others); Fri, 27 Mar 2020 19:30:08 -0400 Received: from Galois.linutronix.de ([193.142.43.55]:55088 "EHLO Galois.linutronix.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726065AbgC0XaI (ORCPT ); Fri, 27 Mar 2020 19:30:08 -0400 Received: from bigeasy by Galois.linutronix.de with local (Exim 4.80) (envelope-from ) id 1jHyQJ-0007UJ-Mc; Sat, 28 Mar 2020 00:30:00 +0100 Date: Sat, 28 Mar 2020 00:29:59 +0100 From: Sebastian Andrzej Siewior To: kernel test robot Cc: Thomas Gleixner , Ingo Molnar , "Peter Zijlstra (Intel)" , linux-kernel@vger.kernel.org, LKP , Tejun Heo , Lai Jiangshan Subject: [PATCH v2] workqueue: Remove the warning in wq_worker_sleeping() Message-ID: <20200327232959.rpylymw2edhtxuwr@linutronix.de> References: <20200327074308.GY11705@shao2-debian> <20200327175350.rw5gex6cwum3ohnu@linutronix.de> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8BIT In-Reply-To: <20200327175350.rw5gex6cwum3ohnu@linutronix.de> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org The kernel test robot triggered a warning with the following race: task-ctx A interrupt-ctx B worker -> process_one_work() -> work_item() -> schedule(); -> sched_submit_work() -> wq_worker_sleeping() -> ->sleeping = 1 atomic_dec_and_test(nr_running) __schedule(); *interrupt* async_page_fault() -> local_irq_enable(); -> schedule(); -> sched_submit_work() -> wq_worker_sleeping() -> if (WARN_ON(->sleeping)) return -> __schedule() -> sched_update_worker() -> wq_worker_running() -> atomic_inc(nr_running); -> ->sleeping = 0; -> sched_update_worker() -> wq_worker_running() if (!->sleeping) return In this context the warning is pointless everything is fine. An interrupt before wq_worker_sleeping() will perform the ->sleeping assignment (0 -> 1 > 0) twice. An interrupt after wq_worker_sleeping() will trigger the warning and nr_running will be decremented (by A) and incremented once (only by B, A will skip it). This is the case until the ->sleeping is zeroed again in wq_worker_running(). Remove the WARN statement because this condition may happen. Document that preemption around wq_worker_sleeping() needs to be disabled to protect ->sleeping and not just as an optimisation. Fixes: 6d25be5782e48 ("sched/core, workqueues: Distangle worker accounting from rq lock") Link: https://lkml.kernel.org/r/20200327074308.GY11705@shao2-debian Reported-by: kernel test robot Signed-off-by: Sebastian Andrzej Siewior --- v1…v2: - Drop the warning instead of using cmpxchg_local(). Tglx pointed out that wq_worker_sleeping() is already invoked with disabled preemption so the race described can not happen. - Document that it is required to have preemption disabled within wq_worker_sleeping() to protect ->sleeping (against the race described in v1). kernel/sched/core.c | 3 ++- kernel/workqueue.c | 6 ++++-- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 1a9983da4408d..7181ea73e5566 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -4112,7 +4112,8 @@ static inline void sched_submit_work(struct task_struct *tsk) * it wants to wake up a task to maintain concurrency. * As this function is called inside the schedule() context, * we disable preemption to avoid it calling schedule() again - * in the possible wakeup of a kworker. + * in the possible wakeup of a kworker and because wq_worker_sleeping() + * requires it. */ if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) { preempt_disable(); diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 4e01c448b4b48..39b968146f5f7 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -858,7 +858,8 @@ void wq_worker_running(struct task_struct *task) * @task: task going to sleep * * This function is called from schedule() when a busy worker is - * going to sleep. + * going to sleep. Preemption needs to be disabled to protect ->sleeping + * assignment. */ void wq_worker_sleeping(struct task_struct *task) { @@ -875,7 +876,8 @@ void wq_worker_sleeping(struct task_struct *task) pool = worker->pool; - if (WARN_ON_ONCE(worker->sleeping)) + /* Return if preempted before wq_worker_running() was reached */ + if (worker->sleeping) return; worker->sleeping = 1; -- 2.26.0