From: Waiman Long
To: Tejun Heo, Lai Jiangshan
Cc: linux-kernel@vger.kernel.org, Juri Lelli, Cestmir Kalina, Alex Gladkov, Phil Auld, Costa Shulyupin, Waiman Long
Subject: [PATCH-wq v2 2/5] workqueue: Enable unbound cpumask update on ordered workqueues
Date: Sat, 3 Feb 2024 10:43:31 -0500
Message-Id: <20240203154334.791910-3-longman@redhat.com>
In-Reply-To: <20240203154334.791910-1-longman@redhat.com>
References: <20240203154334.791910-1-longman@redhat.com>

Ordered workqueues do not currently follow changes made to the
global unbound cpumask because per-pool workqueue changes may break
the ordering guarantee. IOW, a work function in an ordered workqueue
may still run on an isolated CPU.

This patch enables ordered workqueues to follow changes made to the
global unbound cpumask by temporarily freezing the newly allocated
pool_workqueue. A new frozen flag holds back execution of newly
queued work items until the old pwq has been properly flushed.

With this change, ordered workqueues follow unbound cpumask changes
like other unbound workqueues, at the expense of some delay in execution
of work functions during the transition period.

Signed-off-by: Waiman Long
---
 kernel/workqueue.c | 93 +++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 80 insertions(+), 13 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 7ef393f4012e..f089e532758a 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -242,6 +242,7 @@ struct pool_workqueue {
 	int			refcnt;		/* L: reference count */
 	int			nr_in_flight[WORK_NR_COLORS];
 						/* L: nr of in_flight works */
+	int			frozen;		/* L: temporarily frozen */
 
 	/*
 	 * nr_active management and WORK_STRUCT_INACTIVE:
@@ -1667,6 +1668,9 @@ static bool pwq_tryinc_nr_active(struct pool_workqueue *pwq, bool fill)
 
 	lockdep_assert_held(&pool->lock);
 
+	if (pwq->frozen)
+		return false;
+
 	if (!nna) {
 		/* per-cpu workqueue, pwq->nr_active is sufficient */
 		obtained = pwq->nr_active < READ_ONCE(wq->max_active);
@@ -1747,6 +1751,21 @@ static bool pwq_activate_first_inactive(struct pool_workqueue *pwq, bool fill)
 	}
 }
 
+/**
+ * thaw_pwq - thaw a frozen pool_workqueue
+ * @pwq: pool_workqueue to be thawed
+ */
+static void thaw_pwq(struct pool_workqueue *pwq)
+{
+	unsigned long flags;
+
+	raw_spin_lock_irqsave(&pwq->pool->lock, flags);
+	pwq->frozen = false;
+	if (pwq_activate_first_inactive(pwq, true))
+		kick_pool(pwq->pool);
+	raw_spin_unlock_irqrestore(&pwq->pool->lock, flags);
+}
+
 /**
  * node_activate_pending_pwq - Activate a pending pwq on a wq_node_nr_active
  * @nna: wq_node_nr_active to activate a pending pwq for
@@ -4595,6 +4614,14 @@ static void pwq_release_workfn(struct kthread_work *work)
 		mutex_lock(&wq->mutex);
 		list_del_rcu(&pwq->pwqs_node);
 		is_last = list_empty(&wq->pwqs);
+
+		/*
+		 * For ordered workqueue with a frozen dfl_pwq, thaw it now.
+		 */
+		if (!is_last && (wq->flags & __WQ_ORDERED_EXPLICIT) &&
+		    wq->dfl_pwq->frozen)
+			thaw_pwq(wq->dfl_pwq);
+
 		mutex_unlock(&wq->mutex);
 	}
 
@@ -4758,10 +4785,30 @@ static void apply_wqattrs_cleanup(struct apply_wqattrs_ctx *ctx)
 {
 	if (ctx) {
 		int cpu;
+		bool refcheck = false;
 
 		for_each_possible_cpu(cpu)
 			put_pwq_unlocked(ctx->pwq_tbl[cpu]);
+
+		/*
+		 * For ordered workqueue with a frozen dfl_pwq and a reference
+		 * count of 1 in ctx->dfl_pwq, it is highly likely that the
+		 * refcnt will become 0 after the final put_pwq(). Acquire
+		 * wq->mutex to ensure that the pwq won't be freed by
+		 * pwq_release_workfn() when we check pwq later.
+		 */
+		if ((ctx->wq->flags & __WQ_ORDERED_EXPLICIT) &&
+		    ctx->wq->dfl_pwq->frozen &&
+		    (ctx->dfl_pwq->refcnt == 1)) {
+			mutex_lock(&ctx->wq->mutex);
+			refcheck = true;
+		}
 		put_pwq_unlocked(ctx->dfl_pwq);
+		if (refcheck) {
+			if (!ctx->dfl_pwq->refcnt)
+				thaw_pwq(ctx->wq->dfl_pwq);
+			mutex_unlock(&ctx->wq->mutex);
+		}
 
 		free_workqueue_attrs(ctx->attrs);
 
@@ -4821,6 +4868,15 @@ apply_wqattrs_prepare(struct workqueue_struct *wq,
 	cpumask_copy(new_attrs->__pod_cpumask, new_attrs->cpumask);
 	ctx->attrs = new_attrs;
 
+	/*
+	 * For initialized ordered workqueues, there is only one pwq (dfl_pwq).
+	 * Temporarily set the frozen flag of ctx->dfl_pwq to freeze the execution
+	 * of newly queued work items until execution of older work items in
+	 * the old pwq has completed.
+	 */
+	if (!list_empty(&wq->pwqs) && (wq->flags & __WQ_ORDERED_EXPLICIT))
+		ctx->dfl_pwq->frozen = true;
+
 	ctx->wq = wq;
 	return ctx;
 
@@ -4861,13 +4917,8 @@ static int apply_workqueue_attrs_locked(struct workqueue_struct *wq,
 	if (WARN_ON(!(wq->flags & WQ_UNBOUND)))
 		return -EINVAL;
 
-	/* creating multiple pwqs breaks ordering guarantee */
-	if (!list_empty(&wq->pwqs)) {
-		if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
-			return -EINVAL;
-
+	if (!list_empty(&wq->pwqs) && !(wq->flags & __WQ_ORDERED_EXPLICIT))
 		wq->flags &= ~__WQ_ORDERED;
-	}
 
 	ctx = apply_wqattrs_prepare(wq, attrs, wq_unbound_cpumask);
 	if (IS_ERR(ctx))
@@ -6316,11 +6367,28 @@ static int workqueue_apply_unbound_cpumask(const cpumask_var_t unbound_cpumask)
 		if (!(wq->flags & WQ_UNBOUND) || (wq->flags & __WQ_DESTROYING))
 			continue;
 
-		/* creating multiple pwqs breaks ordering guarantee */
+		/*
+		 * We do not support changing cpumask of an ordered workqueue
+		 * again before the previous cpumask change is completed.
+		 * Sleep up to 100ms in 10ms interval to allow previous
+		 * operation to complete and skip it if not done by then.
+		 */
 		if (!list_empty(&wq->pwqs)) {
-			if (wq->flags & __WQ_ORDERED_EXPLICIT)
-				continue;
-			wq->flags &= ~__WQ_ORDERED;
+			struct pool_workqueue *pwq = wq->dfl_pwq;
+
+			if (!(wq->flags & __WQ_ORDERED_EXPLICIT)) {
+				wq->flags &= ~__WQ_ORDERED;
+			} else if (pwq && pwq->frozen) {
+				int i;
+
+				for (i = 0; i < 10; i++) {
+					msleep(10);
+					if (!pwq->frozen)
+						break;
+				}
+				if (WARN_ON_ONCE(pwq->frozen))
+					continue;
+			}
 		}
 
 		ctx = apply_wqattrs_prepare(wq, wq->unbound_attrs, unbound_cpumask);
@@ -6836,9 +6904,8 @@ int workqueue_sysfs_register(struct workqueue_struct *wq)
 	int ret;
 
 	/*
-	 * Adjusting max_active or creating new pwqs by applying
-	 * attributes breaks ordering guarantee. Disallow exposing ordered
-	 * workqueues.
+	 * Adjusting max_active breaks ordering guarantee. Disallow exposing
+	 * ordered workqueues.
 	 */
 	if (WARN_ON(wq->flags & __WQ_ORDERED_EXPLICIT))
 		return -EINVAL;
-- 
2.39.3
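
For readers who want the ordering argument in isolation, here is a rough userspace sketch of the freeze/thaw idea. It is not kernel code and not part of the patch: struct toy_pwq and the toy_* helpers are hypothetical names made up for illustration, and locking, refcounting and worker pools are left out entirely. It only models the invariant the patch relies on: items queued on the new pwq are accepted but held back until the old pwq has drained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ITEMS 8

/* Hypothetical stand-in for a pool_workqueue: a small FIFO plus a frozen flag. */
struct toy_pwq {
	int items[TOY_MAX_ITEMS];
	size_t head;	/* index of the next item to run */
	size_t tail;	/* index of the next free slot */
	bool frozen;	/* while set, queued items are held back */
};

/* Queue an item. Queueing is allowed even while the queue is frozen. */
static bool toy_queue(struct toy_pwq *pwq, int item)
{
	if (pwq->tail == TOY_MAX_ITEMS)
		return false;
	pwq->items[pwq->tail++] = item;
	return true;
}

/*
 * Run the oldest item and report it through @out. Refuses while frozen,
 * loosely mirroring the early return added to pwq_tryinc_nr_active().
 */
static bool toy_run_one(struct toy_pwq *pwq, int *out)
{
	if (pwq->frozen || pwq->head == pwq->tail)
		return false;
	*out = pwq->items[pwq->head++];
	return true;
}

/* Thaw the new queue, but only once the old one has fully drained. */
static bool toy_thaw_if_drained(const struct toy_pwq *old_pwq,
				struct toy_pwq *new_pwq)
{
	assert(old_pwq != new_pwq);
	if (old_pwq->head != old_pwq->tail)
		return false;
	new_pwq->frozen = false;
	return true;
}
```

In this model, queueing 1 and 2 on the old queue and 3 on a frozen new queue can only execute in the order 1, 2, 3, because toy_run_one() refuses the new queue until toy_thaw_if_drained() succeeds. The cost is the same as in the patch: item 3 waits for the transition to finish.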