From: Lai Jiangshan
Date: Wed, 27 Jul 2022 13:38:49 +0800
Subject: Re: [RFC PATCH] workqueue: Unbind workers before sending them to exit()
To: Valentin Schneider
Cc: Tejun Heo, LKML, Peter Zijlstra, Frederic Weisbecker, Juri Lelli, Phil Auld, Marcelo Tosatti
References: <20220719165743.3409313-1-vschneid@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 27, 2022 at 4:36 AM Valentin Schneider wrote:
>
> On 26/07/22 07:30, Tejun Heo wrote:
> > Hello,
> >
> > On Mon, Jul 25, 2022 at 11:21:37AM +0100, Valentin Schneider wrote:
> >> Hm so my choice of words in the changelog wasn't great - "initial setup"
> >> can be kernel init, but *also* setup of whatever workload is being deployed
> >> onto the system.
> >>
> >> So you can be having "normal" background activity (I've seen some IRQs end
> >> up with schedule_work() on isolated CPUs, they're not moved away at boot
> >> time but rather shortly before launching the latency-sensitive app), some
> >> preliminary stats collection / setup to make sure the CPU will be quiet
> >> (e.g. refresh_vm_stats()), and *then* the application starts with
> >> fresh-but-no-longer-required extra pcpu kworkers assigned to its CPU.
> >
> > Ah, I see. I guess we'll need to figure out how to unbind the workers then.
> >
>
> I've been playing with different ways to unbind & wake the workers in a
> sleepable context, but so far I haven't been happy with any of my
> experiments.

I'm writing code to handle the problems of CPU affinity and premature
wakeup of newly created workers. Unbinding the dying worker is also on
the list, but I haven't figured out a good solution.

I was planning to add set_cpus_allowed_ptr_off_rq(), which sets the
cpumask of a task only if it is sleeping and returns -EBUSY otherwise.
It would be guaranteed and documented to be usable in atomic context,
and recommended for dying tasks only.

I can't really be sure it would be implemented the way I expect, since
it touches scheduler code, so I'd better back off.

>
> What hasn't changed much between my attempts is transferring to-be-destroyed
> kworkers from their pool->idle_list to a reaper_list which is walked by
> *something* that does unbind+wakeup. AFAIA as long as the kworker is off
> the pool->idle_list we can play with it (i.e. unbind+wake) off the
> pool->lock.
>
> It's the *something* that's annoying to get right, I don't want it to be
> overly complicated given most users are probably not impacted by what I'm
> trying to fix, but I'm getting the feeling it should still be a per-pool
> kthread. I toyed with a single reaper kthread but a central synchronization
> for all the pools feels like a stupid overhead.

I think fixing it in workqueue.c is complicated. Nevertheless, I will
also try to fix it inside workqueue only, to see what comes up.

>
> If any of that sounds ludicrous please shout, otherwise I'm going to keep
> tinkering :)
>
> > Thanks.
> >
> > --
> > tejun
>
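
For readers following the thread, the two ideas discussed above can be modelled in a few lines of userspace C: moving workers off the idle list while holding the pool lock and then unbinding and waking them without it, and an affinity change that succeeds only for a sleeping task and returns -EBUSY otherwise. This is only a rough sketch under stated assumptions, not kernel code and not either author's implementation. `struct worker`, `struct pool`, `detach_idle()`, `reap()` and `set_affinity_if_sleeping()` are hypothetical names, a pthread mutex stands in for pool->lock, and for simplicity every idle worker is treated as to-be-destroyed.

```c
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins only; none of these types or helpers are kernel APIs. */
struct worker {
    struct worker *next;
    bool sleeping;  /* models "task is off the runqueue" */
    bool bound;     /* models "still affine to the pool's CPU" */
    bool woken;     /* set once the reaper has woken the worker */
};

struct pool {
    pthread_mutex_t lock;      /* models pool->lock */
    struct worker *idle_list;  /* models pool->idle_list */
};

/*
 * Models the semantics described for set_cpus_allowed_ptr_off_rq():
 * change affinity only if the task is sleeping, else return -EBUSY.
 */
static int set_affinity_if_sleeping(struct worker *w)
{
    if (!w->sleeping)
        return -EBUSY;
    w->bound = false;
    return 0;
}

/* Step 1: under the lock, move the idle workers onto a private reaper list. */
static struct worker *detach_idle(struct pool *p)
{
    pthread_mutex_lock(&p->lock);
    struct worker *reaper_list = p->idle_list;
    p->idle_list = NULL;
    pthread_mutex_unlock(&p->lock);
    return reaper_list;
}

/*
 * Step 2: without the lock, unbind and wake each detached worker.
 * Workers that are not sleeping are skipped. Returns the number reaped.
 */
static int reap(struct worker *reaper_list)
{
    int reaped = 0;

    for (struct worker *w = reaper_list; w; w = w->next) {
        if (set_affinity_if_sleeping(w) == 0) {
            w->sleeping = false;
            w->woken = true;
            reaped++;
        }
    }
    return reaped;
}
```

A caller would run `reap(detach_idle(&pool))` from whatever context ends up owning the reaper list. The point of the split is that the lock is held only for the list hand-off, which is what allows the unbind and wakeup to happen in a sleepable context.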