Message-ID: <4e9cc6e3-7582-64af-76d7-6f9f72779146@redhat.com>
Date: Wed, 18 Oct 2023 09:41:55 -0400
Subject: Re: [PATCH-cgroup 1/4] workqueue: Add workqueue_unbound_exclude_cpumask() to exclude CPUs from wq_unbound_cpumask
From: Waiman Long
To: Tejun Heo
Cc: Zefan Li, Johannes Weiner, Jonathan Corbet, Lai Jiangshan, Shuah Khan,
    cgroups@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org
References: <20231013181122.3518610-1-longman@redhat.com>
            <20231013181122.3518610-2-longman@redhat.com>

On 10/18/23 05:24, Tejun Heo wrote:
> Hello,
>
> On Fri, Oct 13, 2023 at 02:11:19PM -0400, Waiman Long wrote:
>> When the "isolcpus" boot command line option is used to add a set
>> of isolated CPUs, those CPUs will be excluded automatically from
>> wq_unbound_cpumask to avoid running work functions from unbound
>> workqueues.
>>
>> Recently cpuset has been extended to allow the creation of partitions
>> of isolated CPUs dynamically. To bring it closer to "isolcpus" in
>> functionality, the CPUs in those isolated cpuset partitions should be
>> excluded from wq_unbound_cpumask as well. Currently this can be done
>> by explicitly writing to the workqueue's cpumask sysfs file after
>> creating the isolated partitions, but that process is error prone.
>> Ideally, the cpuset code should be able to request that the workqueue
>> code exclude those isolated CPUs from wq_unbound_cpumask so that this
>> operation can be done automatically, with the isolated CPUs returned
>> to wq_unbound_cpumask after the destruction of the isolated cpuset
>> partitions.
>>
>> This patch adds a new workqueue_unbound_exclude_cpumask() to enable
>> that. This new function will exclude the specified isolated CPUs
>> from wq_unbound_cpumask. To be able to restore those isolated CPUs
>> after the destruction of isolated cpuset partitions, a new
>> wq_user_unbound_cpumask is added to store the user-provided unbound
>> cpumask, either from the boot command line options or from writing to
>> the cpumask sysfs file. This new cpumask provides the basis for CPU
>> exclusion.
>
> The behaviors around wq_unbound_cpumask are getting pretty inconsistent:
>
> 1. Housekeeping excludes isolated CPUs on boot but allows the user to
>    override it to include isolated CPUs afterwards.
>
> 2. If an unbound wq's cpumask doesn't have any intersection with
>    wq_unbound_cpumask, we ignore the per-wq cpumask and fall back to
>    wq_unbound_cpumask.
>
> 3. You're adding a masking layer on top with exclude which fails to set
>    if the intersection is empty.
>
> Can we do the following for consistency?
>
> 1. The user's requested_unbound_cpumask is stored separately (as in this
>    patch).
>
> 2. The effective wq_unbound_cpumask is determined by
>    requested_unbound_cpumask & housekeeping_cpumask &
>    cpuset_allowed_cpumask. The operation order matters. When an &
>    operation yields an empty cpumask, the cpumask from the previous
>    step is the effective one.

Sure. I will do that.

> 3. Expose these cpumasks in sysfs so that what's happening is obvious.

I can expose the requested_unbound_cpumask. As for the isolated CPUs,
see my other reply.

>> +int workqueue_unbound_exclude_cpumask(cpumask_var_t exclude_cpumask)
>> +{
>> +	cpumask_var_t cpumask;
>> +	int ret = 0;
>> +
>> +	if (!zalloc_cpumask_var(&cpumask, GFP_KERNEL))
>> +		return -ENOMEM;
>> +
>> +	/*
>> +	 * The caller of this function may have called cpus_read_lock(),
>> +	 * use cpus_read_trylock() to avoid potential deadlock.
>> +	 */
>> +	if (!cpus_read_trylock())
>> +		return -EBUSY;
>
> This means that a completely unrelated cpus_write_lock() can fail this
> operation and thus cpuset config writes. Let's please not do this.
> Can't we just make sure that the caller holds the lock?

This condition is actually triggered by a few hotplug tests in
test_cpuset_prs.sh. I will make sure that either the cpu read or write
lock is held before calling this function and eliminate the rcu read
locking here.
>> +	mutex_lock(&wq_pool_mutex);
>> +
>> +	if (!cpumask_andnot(cpumask, wq_user_unbound_cpumask, exclude_cpumask))
>> +		ret = -EINVAL;	/* The new cpumask can't be empty */
>
> For better or worse, the usual mode-of-failure for "no usable CPU" is
> just falling back to something which works rather than failing the
> operation. Let's follow that.

In this case, that means just leaving the current unbound cpumask
unchanged. I will follow the precedent discussed above to make sure that
there is a graceful fallback.

Cheers,
Longman