Date: Wed, 26 Apr 2023 14:17:03 -0700
From: "Paul E. McKenney"
To: Tejun Heo
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com,
    rostedt@goodmis.org, riel@surriel.com
Subject: Re: [PATCH RFC rcu] Stop rcu_tasks_invoke_cbs() from using never-online CPUs
Message-ID: <1713f8f6-88d6-41f1-bbc6-045b2e017289@paulmck-laptop>
Reply-To: paulmck@kernel.org
References: <83d037d1-ef12-4b31-a7b9-7b1ed6c3ae42@paulmck-laptop>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Apr 26, 2023 at 09:55:55AM -1000, Tejun Heo wrote:
> Hello, Paul.
>
> On Wed, Apr 26, 2023 at 10:26:38AM -0700, Paul E. McKenney wrote:
> > The rcu_tasks_invoke_cbs() function relies on queue_work_on() to
> > silently fall back to WORK_CPU_UNBOUND when the specified CPU is
> > offline.  However, the queue_work_on() function's silent fallback
> > mechanism relies on that CPU having been online at some time in the
> > past.  When queue_work_on() is passed a CPU that has never been
> > online, workqueue lockups ensue, which can be bad for your kernel's
> > general health and well-being.
> >
> > This commit therefore checks whether a given CPU is currently online,
> > and, if not, substitutes WORK_CPU_UNBOUND in the subsequent call to
> > queue_work_on().  Why not simply omit the queue_work_on() call entirely?
> > Because this function is flooding callback-invocation notifications
> > to all CPUs, and must deal with possibilities that include a sparse
> > cpu_possible_mask.
> >
> > Fixes: d363f833c6d88 ("rcu-tasks: Use workqueues for multiple rcu_tasks_invoke_cbs() invocations")
> > Reported-by: Tejun Heo
> > Signed-off-by: Paul E. McKenney
>
> I don't understand the code at all but wonder whether it can do something
> similar to cpumask_any_distribute(), which round-robins through the
> specified cpumask.  Would it make sense to change rcu_tasks_invoke_cbs()
> to do something similar against cpu_online_mask?

It might.  But the idea here is to spread the load of queueing the work
as well as spreading the load of invoking the callbacks.

I suppose that I could allocate an array of ints, gather the online CPUs
into that array, and do a power-of-two distribution across that array.
But RCU Tasks allows CPUs to go offline with queued callbacks, so this
array would need to include those CPUs as well as the ones that are
online.

Given that the common-case system has a dense cpu_online_mask, I opted
to keep it simple, which is optimal in the common case.

Or am I missing a trick here?

							Thanx, Paul
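
For concreteness, the check described in the commit log above can be
sketched in userspace C as follows.  This is only an illustration under
stated assumptions, not the patch itself: SIM_NR_CPUS,
SIM_WORK_CPU_UNBOUND, sim_cpu_online[], and sim_pick_work_cpu() are
hypothetical stand-ins for the kernel's CPU-online state and
WORK_CPU_UNBOUND, and nothing here calls the real queue_work_on().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
#define SIM_NR_CPUS          8
#define SIM_WORK_CPU_UNBOUND SIM_NR_CPUS /* sentinel: "no particular CPU" */

/* Simulated online state; static storage starts out all false (offline). */
static bool sim_cpu_online[SIM_NR_CPUS];

/*
 * Choose the CPU argument for a queue_work_on()-style call: use the
 * requested CPU only if it is online right now, otherwise substitute
 * the unbound sentinel instead of skipping the queueing altogether.
 */
static int sim_pick_work_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < SIM_NR_CPUS);
	return sim_cpu_online[cpu] ? cpu : SIM_WORK_CPU_UNBOUND;
}
```

The point of returning the sentinel rather than an error is that the
work still gets queued somewhere, so the notification flood reaches
every queue even when the possible-CPU numbering is sparse.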