Date: Tue, 28 Jun 2022 16:16:56 +0200
From: Christian Brauner
To: Linus Torvalds
Cc: "Eric W. Biederman", Tejun Heo, Petr Mladek, Lai Jiangshan, Michal Hocko,
    Linux Kernel Mailing List, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
    Andrew Morton, Oleg Nesterov
Subject: Re: re. Spurious wakeup on a newly created kthread
Message-ID: <20220628141656.cf2jyrelhcylkpfp@wittgenstein>
References: <20220622140853.31383-1-pmladek@suse.com> <874k0863x8.fsf@email.froward.int.ebiederm.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Jun 25, 2022 at 11:43:15AM -0700, Linus Torvalds wrote:
> On Sat, Jun 25, 2022 at 11:25 AM Linus Torvalds wrote:
> >
> > And that's not at all what the kthread code wants. It wants to set
> > affinity masks, it wants to create a name for the thread, it wants to
> > do all those other things.
> >
> > That code really wants to just do copy_process().
>
> Honestly, I think kernel/kthread.c should be almost rewritten from scratch.
>
> I do not understand why it does all those odd keventd games at all,
> and why kthread_create_info exists in the first place.
>
> Why does kthread_create() not just create the thread directly itself,
> and instead does that odd queue it onto a work function?
>
> Some of that goes back to before the git history, and very little of
> it seems to make any sense.
> It's as if the code is meant to be able to
> run from interrupt context, but that can't be it: it's literally doing
> a GFP_KERNEL kmalloc, it's doing spin-locks without irq safety etc.
>
> So why is it calling kthreadd_task() to create the thread? Purely for
> some crazy odd "make that the parent" reason?
>
> I dunno. The code is odd, unexplained, looks buggy, and most of the
> reasons are probably entirely historical.
>
> I'm adding Christian to this thread too, since I get the feeling that
> it really should be more tightly integrated with copy_process(), and
> that Christian might have comments.
>
> Christian, see some context in the thread here:
>
>    https://lore.kernel.org/all/CAHk-=wiC7rj1o7vTnYUPfD7YxAu09MZiZbahHqvLm9+Cgg1dFw@mail.gmail.com/
>
> for some of this.

Sorry, I was at LSS last week.

I honestly didn't touch the code back then because it seemed almost
entirely unrelated to regular task creation, apart from kernel_thread()
that I added.

I didn't feel comfortable changing a lot of stuff there. IIRC, just a
few months ago io_uring still made use of the kthread infrastructure,
and I think that made the limits of the interface more obvious. We will
soon have two users that create a version of kernel-generated threads
with properties of another process (io_uring and [1]).

In my head, the kthread infra should be able to support generation of
pure kernel threads as well as the creation of user workers, instead of
us adding specialized interfaces to do this. The fact that it doesn't is
a limitation of the interface that imho shows it hasn't grown to adapt
to the new use-cases we have. And imho we'll see more of those.

In this context it's really worth looking at [1] because to some extent
it duplicates bits we have for the kthread infra, whereas I still think
the kthread infra should support both, possibly exposing two APIs: one
returning pure kernel threads and the other returning struct user_worker
or similar.

Idk, it might just be the heat stroke talking...

I don't feel comfortable making strong assertions about the original
implementation of kthreads. I wasn't around and there might be
historical context I'm missing.

One issue that Tejun also mentioned later in the thread, and that we run
into, is that we have a pattern where we create a kthread and then trust
the caller to handle/activate the new task. This is more problematic
once we start supporting something like [1], where that's exposed to a
driver. (Ideally creation of such a task would generate a unique
callback - I think Peter suggested something like this? - that could
only be used on that task...)

[1]: https://lore.kernel.org/lkml/20220620011357.10646-1-michael.christie@oracle.com
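
To make the "unique callback" idea above a bit more concrete, here is a
rough userspace analogy in Python. It is not kernel code and not a
proposal for the actual kthread API; create_parked_worker(), activate()
and stray_wakeup() are hypothetical names invented for this sketch. The
worker stays parked until the one callback bound to it is called, and it
re-checks its flag after every wakeup so a stray wakeup cannot start it
early.

```python
import threading


def create_parked_worker(fn, *args):
    """Create a worker that stays parked until its own activate() is called.

    Hypothetical helper for illustration only; not a kernel interface.
    """
    cond = threading.Condition()
    state = {"activated": False}
    result = []

    def body():
        with cond:
            # Re-check the flag after every wakeup: a stray notify must not
            # let the worker run before its creator has activated it.
            while not state["activated"]:
                cond.wait()
        result.append(fn(*args))

    def activate():
        # This closure is bound to exactly one worker, so it cannot be
        # used to start any other task.
        with cond:
            state["activated"] = True
            cond.notify_all()

    def stray_wakeup():
        # Simulates a wakeup coming from someone other than the creator.
        with cond:
            cond.notify_all()

    thread = threading.Thread(target=body, daemon=True)
    thread.start()
    return thread, activate, stray_wakeup, result
```

Calling stray_wakeup() leaves the worker parked; only activate() lets
fn run. The point of the sketch is just that activation is tied to the
handle returned at creation time rather than left to whoever holds a
pointer to the task.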