Date: Wed, 19 Aug 2020 14:37:58 +0200
From: Sebastian Andrzej Siewior
To: linux-kernel@vger.kernel.org, io-uring@vger.kernel.org
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
    Dietmar Eggemann, Steven Rostedt, Ben Segall, Mel Gorman,
    Jens Axboe, Thomas Gleixner
Subject: [RFC PATCH] sched: Invoke io_wq_worker_sleeping() with enabled preemption
Message-ID: <20200819123758.6v45rj2gvojddsnn@linutronix.de>

During a context switch the scheduler invokes wq_worker_sleeping() with
disabled preemption. Disabling preemption is needed because it protects
access to `worker->sleeping'. As an optimisation it avoids invoking
schedule() within the schedule path as part of possible wake up (thus
preempt_enable_no_resched() afterwards).

The io-wq has been added to the mix in the same section with disabled
preemption. This breaks on PREEMPT_RT because io_wq_worker_sleeping()
acquires a spinlock_t. Also within the schedule() the spinlock_t must be
acquired after tsk_is_pi_blocked() otherwise it will block on the
sleeping lock again while scheduling out.

While playing with `io_uring-bench' I didn't notice a significant
latency spike after converting io_wqe::lock to a raw_spinlock_t. The
latency was more or less the same.

I don't see a significant reason why this lock should become a
raw_spinlock_t therefore I suggest to move it after the
tsk_is_pi_blocked() check.
The io_worker::flags are usually modified under the lock except in the
scheduler path.
Ideally the lock is always acquired since the IO_WORKER_F_UP flag is set
early in the startup and IO_WORKER_F_RUNNING should be set unless the
task loops within schedule(). I *think* ::flags requires the same
protection like workqueue's ::sleeping and therefore I move the check
within the locked section.

Any feedback on this vs raw_spinlock_t?

Signed-off-by: Sebastian Andrzej Siewior
---
 fs/io-wq.c          |  8 ++++----
 kernel/sched/core.c | 10 +++++-----
 2 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/fs/io-wq.c b/fs/io-wq.c
index e92c4724480ca..a7e07b3ac5b95 100644
--- a/fs/io-wq.c
+++ b/fs/io-wq.c
@@ -623,15 +623,15 @@ void io_wq_worker_sleeping(struct task_struct *tsk)
 	struct io_worker *worker = kthread_data(tsk);
 	struct io_wqe *wqe = worker->wqe;
 
+	spin_lock_irq(&wqe->lock);
 	if (!(worker->flags & IO_WORKER_F_UP))
-		return;
+		goto out;
 	if (!(worker->flags & IO_WORKER_F_RUNNING))
-		return;
+		goto out;
 
 	worker->flags &= ~IO_WORKER_F_RUNNING;
-
-	spin_lock_irq(&wqe->lock);
 	io_wqe_dec_running(wqe, worker);
+out:
 	spin_unlock_irq(&wqe->lock);
 }
 
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 3bbb60b97c73c..b76c0f27bd95e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4694,18 +4694,18 @@ static inline void sched_submit_work(struct task_struct *tsk)
 	 * in the possible wakeup of a kworker and because wq_worker_sleeping()
 	 * requires it.
 	 */
-	if (tsk->flags & (PF_WQ_WORKER | PF_IO_WORKER)) {
+	if (tsk->flags & PF_WQ_WORKER) {
 		preempt_disable();
-		if (tsk->flags & PF_WQ_WORKER)
-			wq_worker_sleeping(tsk);
-		else
-			io_wq_worker_sleeping(tsk);
+		wq_worker_sleeping(tsk);
 		preempt_enable_no_resched();
 	}
 
 	if (tsk_is_pi_blocked(tsk))
 		return;
 
+	if (tsk->flags & PF_IO_WORKER)
+		io_wq_worker_sleeping(tsk);
+
 	/*
 	 * If we are going to sleep and we have plugged IO queued,
 	 * make sure to submit it to avoid deadlocks.
-- 
2.28.0